Abstract

The extremal index parameter $\theta$ characterizes the degree of local dependence in the extremes of a stationary
time series and has important applications in a number of areas, such as hydrology, telecommunications,
finance and environmental studies. In this study, a novel estimator for $\theta$ based on the asymptotic
scaling of block-maxima and resampling is introduced. It is shown to be consistent and asymptotically
normal for a large class of $m$-dependent time series. Further, a procedure for the automatic selection of
its tuning parameter is developed and different types of confidence intervals that prove useful in practice
are proposed. The performance of the estimator is examined through simulations, which show its highly
competitive behavior. Finally, the estimator is applied to three real data sets of daily crude oil prices, daily
returns of the S&P 500 stock index, and high-frequency, intra-day traded volumes of a stock. These
applications demonstrate additional diagnostic features of statistical plots based on the new estimator.

Key words and phrases: Heavy tails, extremal index, resampling, permutation, bootstrap, asymptotic
normality.
1 Introduction

Advances in computer technology have enabled research organizations and businesses to collect
large time series data sets. These data sets are primarily characterized by the fine granularity (high frequency)
of the time intervals at which the observations are collected; for example, Internet traffic is sampled at
millisecond intervals, while stock trades are recorded every second. Such time series data are characterized by the
presence of long range dependence (the autocorrelation function decays at a polynomial rate) and the heavy
tailed nature of the marginal distribution (see, e.g. Finkenstädt and Rootzén (2004)). In many cases, another
phenomenon can be observed, namely the presence of clustering of very large or very small values (extremes)
of the data (see e.g. Figure 1). For example, in Internet traces this is the result of bursty arrivals, while in data
on returns of a financial asset this is primarily due to the arrival of an external market shock.

The daily log-returns of the spot price of West Texas Intermediate crude oil are shown for the period
September 2006 - March 2007 in Figure 1(a). A pronounced temporal clustering of the extreme values can
be seen, indicating the presence of local dependence in the extremes. Figure 1(b) similarly shows
substantial clustering of the extremely large traded volumes in the high-frequency data set of all intra-day
trading activity of the Intel stock. Such clustering behavior is of interest to subject matter
experts and it has important implications in practice, since it concerns large consecutive changes associated
with large financial 'losses' or 'gains'. Therefore, quantifying the nature of the dependence structure as well
as the duration of extreme events becomes an essential part of the understanding of these time series data.
[Figure 1 panels: (a) Negative log-returns of WTI oil prices; (b) High-frequency traded volume]



Figure 1: Left plot: Negative log-returns of daily oil prices. The upper and lower dashed lines correspond to the 0.90 and 0.10
quantiles of the data respectively. Right plot: High-frequency traded volumes (in numbers of shares per transaction) of the Intel
stock on November 16, 2005. Observe the clustering of extremes, particularly evident in the extreme price drops or 'losses' (above
the upper dashed line) for the oil data. The trades of the Intel stock with extremely large volumes also exhibit substantial
clustering.



The extremal index $\theta$ is the main parameter that describes and quantifies the clustering characteristics of
the extreme values in many stationary time series. Its formal definition is given next. Let $X = \{X_k\}_{k\in\mathbb{Z}}$ be
a strictly stationary time series. Define the following quantities

$$M_n := \max_{1\le k\le n} X_k \quad\text{and}\quad M_n^{\mathrm{iid}} := \max_{1\le k\le n} \widetilde{X}_k,$$

where the $\widetilde{X}_k$'s are independent and identically distributed (iid) random variables with the same distribution
as the $X_k$'s. Formally, the time series $X$ is said to have an extremal index $\theta$, if for some norming sequences
$c_n > 0$ and $d_n$, we have, as $n\to\infty$,

$$\mathbb{P}\{c_n^{-1}(M_n^{\mathrm{iid}} - d_n) \le x\} \longrightarrow H(x) \quad\text{and}\quad \mathbb{P}\{c_n^{-1}(M_n - d_n) \le x\} \longrightarrow H^{\theta}(x), \tag{1.1}$$



where $H(\cdot)$ is a non-degenerate extreme value distribution (see e.g. p. 417 in Embrechts et al. (1997)).

An informal interpretation of $\theta$ is given in Leadbetter et al. (1983), namely $\theta \approx (\text{mean cluster size})^{-1}$.



For example, for the crude oil log-returns discussed in Section 6 the extremal index is estimated to be
around 0.6, which means that on average roughly two large 'losses' or 'gains' are recorded in a relatively
short time span. The modeling and analysis of rare events (extremes) has been an active area in probability
and statistics (see e.g. Embrechts et al. (1997), Beirlant et al. (2004)). In the context of extremes, the study
and the estimation of the extremal index $\theta$ play an important role.

In this paper, we focus on the non-degenerate case when the extremal index $\theta$ is positive. Observe that
in this case the same normalization and centering sequences for the partial maxima $M_n$ and $M_n^{\mathrm{iid}}$ above yield
non-degenerate limit distributions. The extremal index takes values in the interval $[0, 1]$; a value close to 0
indicates a very strong short range extremal dependence, while a value close to 1 indicates a rather weak dependence.
In fact, for iid $X_k$'s, by (1.1), we have $\theta = 1$. The extremal index, however, characterizes only the dependence
of the extremes in the time series data and thus the data may still exhibit strong dependence, even though
$\theta \approx 1$. The case of $\theta = 0$ is considered to be a pathological one.



Theoretical properties of the extremal index have been studied fairly extensively (O'Brien (1987),
Hsing et al. (1988), and references therein). The problem of estimating $\theta$ has also received some attention in
the literature: Hsing (1993), Smith and Weissman (1994), Weissman and Novak (1998) and Ferro and Segers
(2003). Applications of the extremal index in various scientific areas include its incorporation in calculations
of the Value-at-Risk measure (Longin (2000) and Klüppelberg in Finkenstädt and Rootzén (2004)), in the
study of the Nasdaq and S&P 500 indices (Galbraith and Zernov (2006)) and in the study of GARCH processes
(Laurini (2004)). The estimation of the extremal index $\theta$ is an important practical problem with rapidly
expanding areas of application to finance, insurance, hydrology and telecommunications, to name a few (for
more details, see e.g. Embrechts et al. (1997) and Finkenstädt and Rootzén (2004)).

Most previous estimators of $\theta$ exploit its connection to the point process of exceedances. In this study,
we introduce a new method for estimating $\theta$ based on the asymptotic scaling properties of block-maxima
and resampling. Specifically, let $X_1, \dots, X_n$ be a data sample from a heavy-tailed time series with positive
extremal index $\theta$. The maximum values of the data calculated over blocks of size $m$ scale at a rate $m^{1/\alpha}$,
where $\alpha > 0$ denotes the tail index of the marginal distribution of the data. Further, the normalized limit of
the block maxima is proportional to $\theta^{1/\alpha}\sigma$, where $\sigma > 0$ is an asymptotic scale coefficient of the
$X_k$'s. Thus, by examining a sequence of growing, dyadic block sizes $m = 2^j$, $1 \le j \le \lfloor \log_2 n \rfloor$, $j \in \mathbb{N}$, and
subsequently estimating the mean of logarithms of block-maxima one obtains estimating equations involving
both the tail index $\alpha$ and the parameter $\theta^{1/\alpha}\sigma$. In these equations, the scale $\sigma$ and the extremal index $\theta$ are,
however, coupled. In principle, $\theta$ can be calculated by solving an appropriate nonlinear equation, but the
resulting estimate proves to be too variable. Hence, we resort to resampling. Specifically, we consider either
a bootstrap or a random permutation sample of the original data and then apply the previous methodology.
The resampled data behave, asymptotically, as an independent sequence with unit extremal index, which yields
a second set of estimating equations for the tail index $\alpha$ and the parameter $\sigma$. By combining the resulting two
estimating equations, one based on the original data and another based on the resampled data, we obtain a
numerically stable estimate of $\theta$.

The resulting estimators for $\theta$ are shown to be consistent and asymptotically normal for $m$-dependent
sequences, while at the same time exhibiting good mean squared error properties in finite samples. An
additional advantage of resampling is that it provides a supplementary way of calculating confidence intervals
for $\theta$. Resampling also yields new statistical plots, which provide further diagnostic tools for quantifying the
clustering of extremes at various magnitudes. Simulation studies show that the proposed estimator is a
competitive alternative to existing ones. Further, it provides new insights into the important parameter $\theta$ from
the perspective of resampling, together with new graphical tools that can be successfully used to analyze small
as well as large data sets in practice.

The remainder of the paper is organized as follows: Section 2 describes the proposed estimator. Its
asymptotic properties are established in Section 3. Several methodological and algorithmic issues are discussed
in Section 4, while Section 5 focuses on the evaluation of the estimator through an extensive simulation
study. Three important data sets of daily crude oil prices, the daily returns of the S&P 500 stock index,
and the high-frequency traded volumes of the Intel stock are examined in Section 6. The proofs and some
auxiliary results are given in the Appendix.



2 The max-spectrum based estimator of $\theta$



Let $X = \{X_k\}_{k\in\mathbb{Z}}$ be a positive ergodic strictly stationary sequence with heavy tailed marginals and positive
extremal index $\theta > 0$. Specifically, assume that $\mathbb{P}\{X_k > x\} = 1 - F(x) \sim c_X x^{-\alpha}$, as $x \to \infty$ for some
$\alpha > 0$ and $c_X > 0$, where $a_n \sim b_n$ means $a_n/b_n \to 1$, as $n \to \infty$. The parameter $\alpha$ corresponds to the tail
index of the distribution. Given a sample path $X_1, \dots, X_n$, we define the dyadic block maxima as follows:



$$D(j,k) := \max_{1\le i\le 2^j} X_{2^j(k-1)+i} = \bigvee_{i=1}^{2^j} X_{2^j(k-1)+i}, \tag{2.2}$$



where $j = 1, \dots, \lfloor \log_2 n \rfloor$, $k = 1, \dots, \lfloor n/2^j \rfloor$, and where $\lfloor\cdot\rfloor$ denotes the integer part function. For
heavy-tailed $X_k$'s, relation (1.1) holds with $H(x) = \exp\{-c_X x^{-\alpha}\}$, $x > 0$ and normalization constants $c_n := n^{1/\alpha}$
and $d_n := 0$. Therefore,

$$2^{-j/\alpha} D(j,k) \xrightarrow{d} \theta^{1/\alpha}\sigma Z^{1/\alpha}, \quad\text{as } j \to \infty, \text{ for fixed } k, \tag{2.3}$$

where $Z$ is a standard 1-Fréchet random variable, i.e. $\mathbb{P}\{Z \le z\} = \exp(-z^{-1})$, $z > 0$, and where $\sigma := c_X^{1/\alpha}$
is the asymptotic scale coefficient of the $X_k$'s. Due to the nature of the Fréchet extreme value distribution, the
extremal index parameter $\theta$ appears in the scale coefficient of the limit distribution of the dependent maxima.
This feature will play an important role in the estimation of $\theta$ discussed below.
Next, introduce the statistics

$$Y_j := \frac{1}{n_j}\sum_{k=1}^{n_j} \log_2 D(j,k), \tag{2.4}$$

where $n_j = \lfloor n/2^j \rfloor$. The statistics $Y_j$, $j = 1, \dots, \lfloor \log_2(n) \rfloor$ will be referred to as the max-spectrum of the
data, and the $j$'s as scales. By the assumed ergodicity and provided that moments exist, for a fixed $j$, we get

$$Y_j \xrightarrow{a.s.} \mathbb{E}Y_j = j/\alpha + \mathbb{E}\log_2\big(2^{-j/\alpha} D(j,k)\big), \quad\text{as } n \to \infty. \tag{2.5}$$

Assuming uniform integrability, relation (2.3), on the other hand, implies that

$$\mathbb{E}Y_j \simeq j/\alpha + \log_2(\sigma) + \mathbb{E}\log_2(Z)/\alpha + \log_2(\theta)/\alpha, \quad\text{as } j \to \infty, \tag{2.6}$$

where $a_n \simeq b_n$ means $a_n - b_n \to 0$, as $n \to \infty$. This indicates the existence of a linear relationship between
the statistics $Y_j$ and $j$ up to an error term, which becomes negligible as $n_j$ and $j$ grow. The slope of a linear
fit of $Y_j$ versus $j$ yields an estimator of $1/\alpha$ and thus of $\alpha$. Although our goal is to estimate $\theta$, the estimation of
the tail index $\alpha$ is an intermediate step and an integral part of our analysis.
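The max-spectrum and the slope-based tail index estimate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `max_spectrum` and `tail_index_wls` are hypothetical, the data are assumed strictly positive, and the weights $2^{-j}$ (proportional to the number of block maxima at scale $j$) are one simple weighted least squares choice, not necessarily the weighting used by the authors.

```python
import math


def max_spectrum(x):
    """Return [Y_1, ..., Y_J] with J = floor(log2(len(x))).

    Y_j is the average of log2 of the maxima over non-overlapping
    blocks of size 2**j; the data are assumed strictly positive.
    """
    n = len(x)
    spectrum = []
    for j in range(1, n.bit_length()):       # j = 1, ..., floor(log2 n)
        size = 2 ** j
        n_j = n // size                       # number of complete blocks
        logs = [math.log2(max(x[size * k:size * (k + 1)])) for k in range(n_j)]
        spectrum.append(sum(logs) / n_j)
    return spectrum


def tail_index_wls(spectrum, j_lo, j_hi):
    """Estimate alpha as 1/slope of a weighted fit of Y_j on j, j_lo <= j <= j_hi.

    Assumption: weights 2**(-j), i.e. proportional to the number of
    block maxima averaged at scale j.
    """
    js = list(range(j_lo, j_hi + 1))
    ys = [spectrum[j - 1] for j in js]        # spectrum[0] holds Y_1
    w = [2.0 ** (-j) for j in js]
    sw = sum(w)
    jbar = sum(wi * j for wi, j in zip(w, js)) / sw
    ybar = sum(wi * y for wi, y in zip(w, ys)) / sw
    num = sum(wi * (j - jbar) * (y - ybar) for wi, j, y in zip(w, js, ys))
    den = sum(wi * (j - jbar) ** 2 for wi, j in zip(w, js))
    return den / num                          # reciprocal of the fitted slope
```

For an iid Pareto sample with tail index 2, for instance, the fitted slope over the middle scales should be close to 1/2, so the returned value should be close to 2.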

Observe that, on the other hand, for iid data we have $\theta = 1$ and thus (2.6) becomes:

$$\mathbb{E}Y_j^{\mathrm{iid}} \simeq j/\alpha + \log_2(\sigma) + \mathbb{E}\log_2(Z)/\alpha, \tag{2.7}$$

where $\{Y_j^{\mathrm{iid}}\}$ is the max-spectrum of an iid data set with the same distribution as the $X_k$'s. Relations (2.6)
and (2.7) suggest a method to obtain an estimate of $\theta$. Namely, resample the data, for example, by randomly
drawing (with or without replacement) a sample $X_1^*, \dots, X_k^*$ of size $k = k(n)$ from the set $\{X_1, \dots, X_n\}$.
Intuitively, this destroys the dependence structure of the data, resulting in an approximately independent
sample with the same marginal distribution as the original stationary sequence.

Let $Y_j^*$ be as in (2.4) where now the $D(j,k)$'s are based on the resampled data $X_1^*, \dots, X_k^*$. Since for an
iid sequence we have $\theta = 1$, we expect the resampled sequence to have $\theta \approx 1$, whereas $\alpha$ and $\sigma$ will remain
unchanged. Thus, relation (2.6) becomes

$$\mathbb{E}[Y_j^*] \simeq j/\alpha + \log_2(\sigma) + \mathbb{E}[\log_2(Z)]/\alpha, \tag{2.8}$$

where the term $\log_2(\theta)/\alpha$ is no longer present since $\log_2(\theta) \approx 0$ when $\theta \approx 1$.
Thus, in view of (2.6) and (2.7), we have

$$Y_j^* \approx j/\alpha + \log_2(\sigma) + \mathbb{E}\log_2(Z)/\alpha, \quad\text{and}\quad Y_j \approx j/\alpha + \log_2(\sigma) + \mathbb{E}\log_2(Z)/\alpha + \log_2(\theta)/\alpha.$$

Taking the difference between the last two estimating equations, replacing $\alpha$ by its estimate $\hat\alpha$ based on (2.6),
and solving for $\theta$ we obtain the following estimator for the extremal index:

$$\hat\theta(j) = 2^{-\hat\alpha(j)(Y_j^* - Y_j)}. \tag{2.9}$$
Figure 2: Left panel: The max-spectrum of $X_n = \max\{\tfrac{2}{3}X_{n-1}, \tfrac{1}{3}Z_n\}$, $\theta = 1/3$, with $Z_j$'s iid standard
1-Fréchet (solid line) and the max-spectrum of iid copies of the $X_k$'s (broken line). The two spectra are
essentially linear with equal slopes. Right panel: boxplots of the $\hat\theta(j)$'s obtained from different resampled
versions of a single path of the process. The circles indicate outliers located more than 1.5 fourth-spreads
away from the sample median and the horizontal line is the theoretical value of $\theta = 1/3$.



Observe that for a single data set, one can obtain a large set of estimates $\hat\theta(j)$, based on different resampled
versions of the data. Thus, resampling allows us to gauge the variability of the estimates as well as the
range of scales $j$ where the asymptotics in (2.6) and (2.7) become applicable.

Figure 2 illustrates the main principle behind the proposed estimator. The left panel shows the combined
max-spectra of a dependent sequence and an iid sample. The two max-spectra are parallel with equal slopes
$\approx 1/\alpha$, since the marginal distributions behind the two spectra are the same. The difference is in the intercept
and this is where the value of $\theta$ is derived from. The right panel shows boxplots of $\hat\theta(j)$ estimates obtained
from 200 independent resampled versions of a single path of the process on the left. Observe that the medians
of the $\hat\theta(j)$'s closely follow the true value $\theta = 1/3$ over a range of scales (for more details, see Section 4
below).
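As a quick numerical reading of the estimator, consider the following worked case. The inputs are illustrative assumptions chosen to mimic the setting of Figure 2 (unit tail index, vertical gap of $\log_2 3$ between the resampled and the original spectrum); they are not values read off the data.

```latex
\hat\alpha(j) = 1, \qquad Y_j^* - Y_j = \log_2 3 \approx 1.585
\quad\Longrightarrow\quad
\hat\theta(j) = 2^{-\hat\alpha(j)\,(Y_j^* - Y_j)} = 2^{-\log_2 3} = \tfrac{1}{3}.
```

In words, each additional vertical gap of $1/\hat\alpha(j)$ between the two spectra halves the estimated extremal index, and a vanishing gap corresponds to $\hat\theta(j) = 1$.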
Remarks:

(1) The statistics $Y_j$ in (2.4) are not only dependent in $j$, but more importantly, they have different variances
in $j$ since they involve averages of $n_j \approx n/2^j$ terms. Thus, to reduce the variance in the regression estimators
of $\alpha$, it is essential to use a weighted or generalized least squares method (see e.g. Stoev et al. (2006), for
more details).

(2) The proposed resampling procedure avoids the problem of estimating the scale parameter $\sigma = c_X^{1/\alpha}$;
however, an estimate of $\alpha$ is still needed. The algorithmic implementation of the estimators $\hat\theta(j)$ and other
important practical issues are discussed below. The appropriate resampling sample size $k(n)$, from the
perspective of asymptotics, is $o(\sqrt{n})$ (see Section 3).

(3) The estimate $\hat\theta(j)$ depends on the scale $j$, as indicated. An automatic procedure for the choice of $j$ is
presented in Section 4.



3 Theoretical properties

Let $X = \{X_k\}_{k\in\mathbb{Z}}$ be a strictly stationary time series with marginal heavy-tailed c.d.f. $F$ and let also
$M_n = \max_{1\le i\le n} X_i = \bigvee_{i=1}^{n} X_i$. We then have

$$F_n(x) := \mathbb{P}\{M_n \le n^{1/\alpha}x\} = \exp\{-c(n,x)x^{-\alpha}\}, \quad x \in \mathbb{R}, \tag{3.10}$$

for some function $c(n,x) > 0$, $n \in \mathbb{N}$. As in (2.3), if the time series $X$ has a positive extremal index
$\theta \in (0,1]$, then

$$n^{-1/\alpha}M_n \xrightarrow{d} (\theta c_X)^{1/\alpha}Z^{1/\alpha}, \quad\text{as } n \to \infty, \tag{3.11}$$

where $Z$ is a standard 1-Fréchet variable: $\mathbb{P}\{Z \le x\} = e^{-1/x}$, $x > 0$.

Our asymptotic results rely on the moment behavior of $f(M_n/n^{1/\alpha})$, for certain deterministic functions
$f$, and involve some additional technical conditions, outlined below (for more details, see the Appendix).

Condition 1. There exist $\beta > 0$ and $R \in \mathbb{R}$, such that

$$|c(n,x) - \theta c_X| \le c_1(x)n^{-\beta}, \quad\text{for all } x > 0, \quad\text{and}\quad c_1(x) = O(x^{-R}),\ x \downarrow 0, \tag{3.12}$$

where $\theta \in (0, 1]$.

Condition 2. $F_n(0) = 0$ and for all $x > 0$,

$$c(n,x) \ge c_2 \min\{1, x^{\gamma}\}, \quad\text{for some } \gamma \in (0,\alpha), \tag{3.13}$$

for all sufficiently large $n \in \mathbb{N}$, where $c_2 > 0$ does not depend on $n$.

Remarks:

(1) The conditions (3.12) and (3.13) are not very stringent. For example, let

$$X_k = \max\{Z_k, Z_{k-1}, \dots, Z_{k-m+1}\}, \quad k \in \mathbb{Z}, \tag{3.14}$$

where the $Z_k$'s are independent, standard $\alpha$-Fréchet. We then have

$$\mathbb{P}\{M_n \le n^{1/\alpha}x\} = \mathbb{P}\{Z_{-m+2} \le n^{1/\alpha}x, \dots, Z_n \le n^{1/\alpha}x\} = \exp\{-c(n,x)x^{-\alpha}\},$$

where the function $c(n,x) = (n + m - 1)/n = 1 + O(1/n)$ does not depend on $x$ and $\beta = 1$, in this
simple case. Conditions 1 and 2 above hold for a more general class of moving maxima processes (see
Hamidieh et al. (2007)).

(2) Condition 1 and relation (3.10) imply (3.11), that is, the extremal index of the time series $X$ is precisely
equal to $\theta$ in (3.12). Thus, (3.12) quantifies further the rate of the convergence in (3.11).

Description of the asymptotic regime: To obtain the consistency of statistics based on the max-spectrum
$Y = \{Y_j\}$, we focus on the range of scales $[j(n), \ell + j(n)]$, where $\ell \in \mathbb{N}$ is fixed and where $j(n) \to \infty$, as
$n \to \infty$. We then define

$$\hat\alpha(j) := \Big(\sum_{i=0}^{\ell} w_i Y_{i+j(n)}\Big)^{-1}, \tag{3.15}$$

where the weights $w_i$ are fixed and such that $\sum_{i=0}^{\ell} w_i = 0$ and $\sum_{i=0}^{\ell} i\,w_i = 1$. The weights $w_i$ can be
obtained, for example, either from GLS or WLS regression of $Y_{i+j(n)}$ versus $i$, for $0 \le i \le \ell$ (see Stoev et al.
(2006), for more details).

The estimator $\hat\theta$ in (2.9) involves both the max-spectrum $Y$ of the dependent data and the max-spectrum
$Y^*$ of the resampled data. Observe that

$$\hat\theta(j) = 2^{-\hat\alpha(j)(\hat C^*(j) - \hat C(j))}, \quad\text{where } \hat C^*(j) := Y_j^* - j/\hat\alpha(j) \text{ and } \hat C(j) := Y_j - j/\hat\alpha(j), \tag{3.16}$$

since trivially $Y_j^* - Y_j = \hat C^*(j) - \hat C(j)$. We will establish the asymptotic normality of $\hat\theta(j)$ in three steps:

(Step 1.) We first establish rates of convergence for the quantities $\hat\alpha(j)$ and $\hat C(j)$, which are based on
the max-spectrum $\{Y_j\}$.

(Step 2.) We then show that the $\hat C^*(j)$'s are asymptotically normal (under certain conditions) in two
resampling schemes: bootstrap and random permutations.

(Step 3.) We finally combine the results from Steps 1 and 2 above to establish the asymptotic normality
of $\hat\theta(j)$.

Main results: We establish next the asymptotic normality of $\hat\theta(j)$ defined in (3.16), by following the three
steps outlined above.

Step 1: The following result provides rates of convergence for $\hat\alpha(j)$ and $\hat C(j)$.



Proposition 3.1 Let $X_1, \dots, X_n$ be a sample from an $m$-dependent, strictly stationary time series
$X = \{X_k\}_{k\in\mathbb{Z}}$, which satisfies Conditions 1 and 2 above.

Then, for $\hat\alpha(j)$ and $\hat C(j)$ in (3.15) and (3.16), we have, as $n \to \infty$



(3.17) 



a{j) = a + Op{ ., , . ri «x ) + 0p(^T7^), and CiJ) = C + Op{ ., , ■ r^ «. ) + 0p(^t7^), 



with $C = \log_2(\theta)/\alpha + \log_2(c_X)/\alpha + \mathbb{E}\log_2(Z)/\alpha$, where $Z$ is a standard 1-Fréchet variable.

The proof of this result is given in the Appendix. Observe that Proposition 3.1 is valid for an arbitrary
stationary $m$-dependent time series which satisfies (3.12) and (3.13). It is valid, in particular, for the simple
process $\{X_k\}_{k\in\mathbb{Z}}$ in (3.14) and more generally for the moving maxima processes in (5.22).

Step 2: We now employ resampling to obtain an approximately independent data sample $X_1^*, \dots, X_k^*$. Here,
we consider two resampling schemes, the first based on bootstrap and the second on permutations. We
then establish asymptotic normality results for the max-spectrum in both schemes. The sample $X_1^* := X_{i_1}$,
$X_2^* := X_{i_2}, \dots, X_k^* := X_{i_k}$ is a bootstrap sample from the data $X_1, \dots, X_n$ if the indices $i_1, \dots, i_k$
are drawn randomly and with replacement from the set $\{1, \dots, n\}$. When these indices are drawn without
replacement and $k \le n$, we obtain a permutation sample. We need the following:

Lemma 3.1 Let $i_1, \dots, i_k$ be a collection of randomly drawn indices either with replacement or without
replacement from the set $\{1, \dots, n\}$. For any fixed $m \in \mathbb{N}$, we have

$$\mathbb{P}\Big\{\min_{1\le j'<j''\le k} |i_{j'} - i_{j''}| > m\Big\} \ge 1 - mk^2/(n - k).$$

The proof is given in the Appendix. This result implies that for $k(n) = o(\sqrt{n})$, $n \to \infty$, the indices
$\{i_j, 1 \le j \le k\}$ are spaced by at least $m$ lags away from each other, with probability asymptotically equal
to 1, as $n \to \infty$. Therefore, if the data $X_1, \dots, X_n$ come from an $m$-dependent time series, for the purposes
of asymptotics in distribution, both the bootstrap and the permutation samples of size $k = o(\sqrt{n})$ become
essentially independent, with high probability, as $n \to \infty$. This fact and Proposition 4.2 in Stoev et al.
(2006) readily imply the following result.



Theorem 3.1 Let $X = \{X_i\}_{i\in\mathbb{Z}}$ be a strictly stationary $m$-dependent time series, which satisfies Conditions
1 and 2 above. Let $X_1^*, \dots, X_k^*$ be either a bootstrap or a permutation sample from $X_1, \dots, X_n$, where
$k(n) \to \infty$ is such that $k(n) = o(n^{1/2})$, as $n \to \infty$, and let $Y^*$ be its corresponding max-spectrum.

Letj{k) ^oo, n -^ oo, be such that k/2^^''^^^+^^^ + j{kf2^^''^ /k — >0,ask-^oo. 
Then, for $\hat C^*(j)$ in (3.16), we have

$$\sqrt{k_j}\,\big(\hat C^*(j) - C^*\big) \xrightarrow{d} \mathcal{N}(0, \sigma^2_{C^*}), \quad\text{as } n \to \infty, \tag{3.18}$$

where $k_j = k(n)/2^{j(k)}$. Here $C^* := \log_2(c_X)/\alpha + \mathbb{E}\log_2(Z)/\alpha$, and $\sigma^2_{C^*} = \alpha^{-2}\mathrm{Var}(\log_2 Z)$, where $Z$ is
a standard 1-Fréchet variable.

The proof is given in the Appendix.

Step 3: The following theorem is the main result of this section. It combines the results of Proposition 3.1
and Theorem 3.1 to establish the asymptotic normality of $\hat\theta(j)$.

Theorem 3.2 Assume the conditions of Theorem 3.1 and let $\hat\alpha(j)$ be as in (3.15), where $Y$ is the
max-spectrum of the data $X_1, \dots, X_n$. Let also $\hat C(j)$ and $\hat C^*(j)$ be as in (3.16), where $Y^*$ is the max-spectrum
of either a bootstrap or a permutation sample $X_1^*, \dots, X_k^*$ of the data.



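Let $k(n) = o(\sqrt{n})$, $n \to \infty$ and $j(k) \to \infty$, $k \to \infty$, be such that

$$k/2^{j(k)(1+2\min\{1,\beta\})} + j(k)^2\,2^{j(k)}/k \longrightarrow 0, \quad\text{as } k \to \infty. \tag{3.19}$$

Then, for $\hat\theta(j)$ in (3.16), we have

$$\sqrt{k_j}\,\big(\hat\theta(j) - \theta\big) \xrightarrow{d} \mathcal{N}(0, \theta^2\pi^2/6), \quad\text{as } n \to \infty,$$

where $k_j = k(n)/2^{j(k)}$.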

The proof of this result is given in the Appendix. A few important remarks follow.

Remarks:

(1) Theorem 3.2 applies, for example, to the class of moving maxima processes in (5.22), under mild
assumptions on the innovations $Z_k$'s (see Conditions 1' & 2' below). It holds, for example, for Pareto, mixtures
of Pareto or Fréchet innovations.

(2) Let $\delta \in (0, 2\min\{1,\beta\})$ be arbitrary and suppose that $k/2^{j(k)(1+2\min\{1,\beta\})} \sim k^{-\delta}$, as $k \to \infty$. We then
have $2^{j(k)} \sim k^{(1+\delta)/(1+2\min\{1,\beta\})}$, as $k \to \infty$, which, since $\delta < 2\min\{1,\beta\}$, implies that relation (3.19) holds.
This yields the rate $k_j \sim k^{(2\min\{1,\beta\}-\delta)/(1+2\min\{1,\beta\})}$ in Theorem 3.2. Since $k = o(\sqrt{n})$ and since $\delta > 0$
can be taken arbitrarily small, we can achieve rates up to $n^{\min\{1,\beta\}/(1+2\min\{1,\beta\})}$. For example, if $\beta > 1/2$ the rate of
$n^{1/4}$ is possible while the best possible rate is $o(n^{1/3})$.

4 Implementation issues

We present next an algorithmic implementation for the proposed estimator of $\theta$ and discuss its main features.
We then propose a second algorithm for the automatic selection of scales.

In Theorem 3.2 we only consider resampled sets from the data of size $k(n) = o(\sqrt{n})$. In practice, we
found that the estimators of $\theta$ continue to work well even if one considers random permutations of the entire
data sample of size $k(n) = n$. Using bootstrap instead of permutation samples results in estimates $\hat\theta(j)$ with
larger variances and bias (for large $j$'s), especially for small sample sizes. Thus, in the sequel, we focus on
permutation based resampling and utilize the entire data set.
Algorithm 1: (estimation of $\theta$)

1. Compute the $Y_j$'s and the $\hat\alpha(j)$'s as in (2.4) and (3.15) based on the original data.

2. Randomly permute (i.e. shuffle) the data $N_{in}$ times and collect the $N_{in}$ statistics $Y_j^*$.

3. Find the $N_{in}$ differences $Y_j^* - Y_j$ and compute the sample mean for the positive differences only:

$$\hat\Delta(j) = \mathrm{mean}\{Y_j^* - Y_j\}_+ .$$

4. Obtain the estimates of $\theta$ for each scale $j$: $\hat\theta(j) = \min\{2^{-\hat\alpha(j)\hat\Delta(j)}, 1\}$.

5. Repeat steps 2, 3, and 4, $N_{out}$ number of times and collect the $\hat\theta(j)$ values.

6. Produce a sequence of $\hat\theta(j)$ boxplots from the $N_{out}$ available values, for each scale $j$.

7. Visually inspect the boxplots of $\hat\theta(j)$ and select a range of scales where the medians of the boxplots
stabilize. Estimate $\theta$ by using the median values from this range of scales.
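Steps 1 to 5 of Algorithm 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `max_spectrum`, `wls_slope` and `algorithm1` are hypothetical, the data are assumed strictly positive, the tail index at scale $j$ is fitted over scales $j$ up to the second largest one with weights $2^{-j}$, and the estimate is capped at 1 as described in the discussion of Step 4.

```python
import math
import random


def max_spectrum(x):
    """Y_j = mean of log2 block maxima over blocks of size 2**j, j = 1..floor(log2 n)."""
    n = len(x)
    out = []
    for j in range(1, n.bit_length()):
        size = 2 ** j
        n_j = n // size
        out.append(sum(math.log2(max(x[size * k:size * (k + 1)]))
                       for k in range(n_j)) / n_j)
    return out


def wls_slope(ys, js):
    """Weighted least-squares slope of ys on js (assumed weights 2**(-j))."""
    w = [2.0 ** (-j) for j in js]
    sw = sum(w)
    jbar = sum(wi * j for wi, j in zip(w, js)) / sw
    ybar = sum(wi * y for wi, y in zip(w, ys)) / sw
    num = sum(wi * (j - jbar) * (y - ybar) for wi, j, y in zip(w, js, ys))
    den = sum(wi * (j - jbar) ** 2 for wi, j in zip(w, js))
    return num / den


def algorithm1(x, j_lo, n_in=25, n_out=100, rng=None):
    """Return {j: list of n_out estimates theta_hat(j)} for j = j_lo, ..., J - 2."""
    if rng is None:
        rng = random.Random()
    y = max_spectrum(x)                        # Step 1: spectrum of the original data
    top = len(y) - 1                           # second largest scale; the largest is discarded
    scales = list(range(j_lo, top))
    alpha = {}
    for j in scales:                           # Step 1: alpha_hat(j) from scales j..top
        js = list(range(j, top + 1))
        alpha[j] = 1.0 / wls_slope([y[i - 1] for i in js], js)
    shuffled = list(x)
    estimates = {j: [] for j in scales}
    for _ in range(n_out):                     # Step 5: outer loop
        positive = {j: [] for j in scales}
        for _ in range(n_in):                  # Step 2: inner loop of permutations
            rng.shuffle(shuffled)
            y_star = max_spectrum(shuffled)
            for j in scales:
                d = y_star[j - 1] - y[j - 1]
                if d > 0:                      # Step 3: keep positive differences only
                    positive[j].append(d)
        for j in scales:
            delta = sum(positive[j]) / len(positive[j]) if positive[j] else 0.0
            # Step 4: min(2**(-alpha*delta), 1), written so that the power cannot overflow
            estimates[j].append(2.0 ** min(-alpha[j] * delta, 0.0))
    return estimates
```

The returned dictionary holds, for each scale, the sample of estimates that Steps 6 and 7 would display as boxplots; the plotting itself is left out of this sketch.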
Figure 3: Estimation of $\theta$ for the process $X_n = \max\{\tfrac{1}{2}X_{n-1}, \tfrac{1}{2}Z_n\}$, $\theta = 1/2$, $Z_j$ iid standard 1-Fréchet,
with sample size of n=2^^, Nout = 500 and Nin = 25. Left panel: Boxplots of ^(j)'s with the last two scales 
omitted. Right panel: A 'heat map' visualizing the Kruskal-Wallis test for the automatic selection of scales;
black corresponds to $p$-values greater than 0.05.



In the following remarks we explain and justify the steps in the above algorithm.

Discussion of Algorithm 1:

Step 1: The estimate $\hat\alpha(j)$ is based on the range of scales $j, \dots, j + \ell$, where $j + \ell = \lfloor \log_2(n) \rfloor - 1$ is
chosen to be the second largest available scale in the data. In practice, we discard the highest scale since it
involves an average of at most two block-maxima. We recommend using either generalized least squares with
the asymptotic covariance matrix for the max-spectrum given in Stoev et al. (2006) or weighted least squares
which account for the fact that $\mathrm{Var}(Y_j) \propto 1/n_j \propto 2^j$. Both approaches are comparable and considerably
better than ordinary least squares regression, which should not be used.

Steps 2 & 3: We introduce an inner loop with $N_{in}$ iterations to reduce the variability of $Y_j^* - Y_j$. This
considerably improves the variance of the $\theta$ estimates. In Step 3, we average only the positive differences
$Y_j^* - Y_j$ since by relations (2.6) and (2.8), we have $\mathbb{E}Y_j^* \ge \mathbb{E}Y_j$. Our experiments indicate that replacing the
"mean" by the "median" in Step 3 yields similar results.

Step 4: As in Ferro and Segers (2003), we take the minimum of the calculated estimate and 1 to ensure
that $\hat\theta(j) \in [0,1]$.

Step 5: This step yields a sample of $N_{out}$ estimates of $\theta$ for each scale $j$. The practical choice of the
parameters $N_{out}$ and $N_{in}$ is discussed in Section 5.

Step 6: In practice, the estimation of $\theta$ requires selecting the range of scales where the best bias/variance
trade-off is achieved. Estimating $\theta$ over the larger scales $j$ (larger block sizes) involves lower bias, but leads
to larger variance as the number of block-maxima is reduced. At lower scales $j$ (smaller block sizes) the bias
grows but the variance is reduced (see Figure 3). In general, reliable estimates of $\theta$ can be obtained from the
middle range of scales. The choice of the scales $j$ is addressed in the sequel.

Figure 3 (left panel) illustrates the above algorithm over a simulated process with known extremal index
$\theta = 1/2$. A stable range of scales 4 to 7 can be observed. In practice, we recommend taking the median of
the sample of the pooled $N_{out}$ estimates $\hat\theta(j)$ from each one of the scales $j$ in the stable range. In this case
we obtained a point estimate of 0.52. One can also obtain an empirical 95% confidence interval, based on
0.025-th and 0.975-th empirical quantiles of the pooled 9{j) values to obtain (0.40, 0.62) (see also relations 
K23\i and KTM below). 
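The pooled point estimate and the empirical interval described above can be computed with a short helper. The sketch below is hedged: `quantile` and `pooled_summary` are hypothetical names, `estimates` is assumed to map each scale to its list of resampled estimates, and linear interpolation between order statistics is only one of several common conventions for empirical quantiles.

```python
import statistics


def quantile(sorted_values, q):
    """Linearly interpolated empirical quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def pooled_summary(estimates, j1, j2, level=0.95):
    """Median and empirical interval of the estimates pooled over scales j1..j2."""
    pooled = sorted(t for j in range(j1, j2 + 1) for t in estimates[j])
    tail = (1.0 - level) / 2.0
    interval = (quantile(pooled, tail), quantile(pooled, 1.0 - tail))
    return statistics.median(pooled), interval
```

With `level=0.95` the interval endpoints are the 0.025 and 0.975 empirical quantiles of the pooled values.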



The selection of the stable range of scales $j$ in Step 6 of Algorithm 1 is subjective. We propose next an
automated procedure for selecting the range of scales, based on the Kruskal-Wallis test.

Algorithm 2: (automatic selection of scales)

1. For every given range $j_1 \le j \le j_2$, $j_1 < j_2$ of possible consecutive scales in the data, perform a
Kruskal-Wallis test for equality of the medians, based on the samples of $N_{out}$ values of $\hat\theta(j)$.

2. Consider the array of $p$-values $p(j_1, j_2)$ resulting from the tests in Step 1. Declare the medians over
the range $[j_1, j_2]$ 'statistically different' if $p$ is less than a prescribed significance threshold.

3. Produce a pooled estimate of $\theta$ based on the longest scale range where the medians are 'statistically
equal'.

4. If there are ties in Step 3, pick the range starting at the lowest scale. If all medians are 'statistically
different', pick the middle scale and follow up by a visual inspection of the results.
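The selection logic in Steps 2 to 4 can be sketched independently of how the tests are run. In the illustration below, `select_scale_range` is a hypothetical name and the $p$-values are assumed to have been computed beforehand and stored in a dictionary keyed by $(j_1, j_2)$; they could come, for example, from `scipy.stats.kruskal` applied to the samples of estimates at scales $j_1, \dots, j_2$. For an even number of scales the "middle scale" is ambiguous, and the upper of the two central scales is used here as an arbitrary convention.

```python
def select_scale_range(pvalues, threshold=0.05):
    """Pick the longest range (j1, j2) whose p-value is at least `threshold`.

    pvalues maps pairs (j1, j2) with j1 < j2 to Kruskal-Wallis p-values.
    Ties in length are resolved in favour of the lowest starting scale.
    If no range passes, the middle scale is returned as (j, j).
    """
    best = None
    for (j1, j2) in sorted(pvalues):           # lowest starting scale first
        if pvalues[(j1, j2)] >= threshold:     # medians 'statistically equal'
            if best is None or (j2 - j1) > (best[1] - best[0]):
                best = (j1, j2)
    if best is not None:
        return best
    scales = sorted({j for pair in pvalues for j in pair})
    middle = scales[len(scales) // 2]
    return (middle, middle)
```

Because the pairs are visited in increasing order of the starting scale and a candidate is replaced only by a strictly longer range, the tie-breaking rule of Step 4 is satisfied without extra bookkeeping.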

The proposed automatic scale selection procedure is evaluated in Section 5. One possible method to visualize
the results of this analysis is to construct a 'heat map' of the $p$-values for the Kruskal-Wallis tests (see Figure
3, right panel). The axes correspond to scales $j_1$ and $j_2$; the regions in black indicate ranges of scales $[j_1, j_2]$
with $p$-values greater than 0.05. This heat map shows that the medians over the scale range $[j_1, j_2] = [5, 7]$
are 'statistically equal' at a level of 5%. A point estimate based on the pooled values from scales 5 to 7 is
0.52 with an empirical 95% confidence interval of (0.39, 0.63).

5 Performance evaluation

We present next the results of a simulation study and comment on the performance of the max-spectrum, the
Ferro-Segers (Ferro and Segers (2003)) and the runs (O'Brien (1987)) estimators for the extremal index. We
briefly summarize the two competing estimators next:



The first estimator is based on the characterization of the extremal index given by O'Brien (1987). In
this characterization, $\theta$ is expressed as the limiting probability that an exceedance is followed by a run of
observations below a high threshold $u_n$:

$$\theta = \lim_{n\to\infty} \mathbb{P}\Big\{\bigvee_{j=2}^{r_n} X_j \le u_n \,\Big|\, X_1 > u_n\Big\},$$

where $r_n = o(n)$ is the length of runs of values of the process falling below the threshold given that an
exceedance has occurred. This characterization motivates the definition of the runs estimator for a fixed high
threshold $u$ and a specified runs length $r$:

$$\hat\theta^{R}(u, r) = \frac{\sum_{i=1}^{n-r} \mathbf{1}\big(X_i > u,\ \bigvee_{j=i+1}^{i+r} X_j \le u\big)}{\sum_{i=1}^{n-r} \mathbf{1}(X_i > u)}. \tag{5.20}$$



The runs estimator is asymptotically normal and consistent. See Weissman and Novak (1998) and references
therein for additional information.
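A runs-type estimator of this kind is straightforward to compute. The sketch below is an illustration under assumptions: `runs_estimator` is a hypothetical name, and both sums run over the indices that still have a full window of $r$ subsequent observations, which is one common convention and may differ in detail from the definition used in the references.

```python
def runs_estimator(x, u, r):
    """Fraction of exceedances of u that are followed by r values not exceeding u.

    Only positions with a complete window of r followers are counted.
    Returns NaN when there are no such exceedances.
    """
    exceedances = 0
    followed_by_run = 0
    for i in range(len(x) - r):
        if x[i] > u:
            exceedances += 1
            if all(v <= u for v in x[i + 1:i + 1 + r]):
                followed_by_run += 1
    if exceedances == 0:
        return float("nan")
    return followed_by_run / exceedances
```

For the toy series `[0, 5, 0, 0, 5, 5, 0, 0, 0, 0]` with `u = 1` and `r = 2`, three exceedances are counted and two of them are followed by a run of two small values, giving 2/3.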



The second estimator is due to Ferro and Segers (2003). An interesting aspect of this estimator is that
it does not require an auxiliary parameter (such as the run length in the case of the runs estimator). However, one still
has to choose the threshold. Using a point process approach, Ferro and Segers (2003) show that the
inter-exceedance times (time differences between successive values above a threshold) of the extreme values,
normalized by $\bar F(u_n) := 1 - F(u_n)$, converge in distribution to a random variable $T_\theta$ with a mass of $1 - \theta$ at $t = 0$ and an
exponential distribution with rate equal to $\theta$ on $t > 0$. Using a moment estimator, they first obtain:

$$\hat\theta_1 = \frac{2\big(\sum_{i=1}^{N-1} T_i\big)^2}{(N-1)\sum_{i=1}^{N-1} T_i^2},$$

where $\{T_i\}$ are the inter-exceedance times and $N$ is the number of exceedances of a fixed high threshold $u$.
A bias corrected version gives,

$$\hat\theta_2 = \frac{2\big(\sum_{i=1}^{N-1} (T_i - 1)\big)^2}{(N-1)\sum_{i=1}^{N-1} (T_i - 1)(T_i - 2)}.$$

To obtain the final form of the estimator, a further adjustment is made to ensure that the values of the estimator
lie between 0 and 1:

$$\hat\theta^{FS} = \begin{cases} 1 \wedge \hat\theta_1 & \text{if } \max\{T_i : 1 \le i \le N-1\} \le 2, \\ 1 \wedge \hat\theta_2 & \text{if } \max\{T_i : 1 \le i \le N-1\} > 2. \end{cases}$$

The Ferro-Segers estimator is consistent for m-dependent strictly stationary sequences. 
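
The construction above can be sketched as follows. This is a minimal sketch, assuming the inter-exceedance times are measured in numbers of observations and that the two moment estimators and the final truncation at 1 are as described in the text; the function name `intervals_estimator` is ours.

```python
def intervals_estimator(x, u):
    """Ferro-Segers type estimate of the extremal index for threshold u."""
    times = [i for i, v in enumerate(x) if v > u]   # exceedance times
    if len(times) < 2:
        raise ValueError("at least two exceedances are needed")
    gaps = [b - a for a, b in zip(times, times[1:])]  # T_1, ..., T_{N-1}
    m = len(gaps)                                     # m = N - 1
    if max(gaps) <= 2:
        # first moment estimator
        num = 2.0 * sum(gaps) ** 2
        den = m * sum(t * t for t in gaps)
    else:
        # bias-corrected version; den > 0 because some gap exceeds 2
        num = 2.0 * sum(t - 1 for t in gaps) ** 2
        den = m * sum((t - 1) * (t - 2) for t in gaps)
    return min(1.0, num / den)
```

Only the threshold `u` has to be chosen; no run length enters the computation.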

Next, we discuss three types of processes, used in the simulation study, for which the extremal index is 
given in closed form. 
• The max-autoregressive (armax) process of order one is defined as:

\[ X_n = \max\{b X_{n-1}, (1 - b) Z_n\}, \quad \text{where } 0 \le b < 1, \]

and where $\{Z_n\}_{n\in\mathbb{Z}}$ is an iid sequence of standard $\alpha$-Fréchet random variables. For such processes, $\theta = 1 - b^{\alpha}$
can take any value in the interval $(0, 1]$ (see e.g. Beirlant et al. (2004) for additional information).
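• The linear process $\{Y_n\}$, $n \in \mathbb{Z}$ is defined as:

\[ Y_n = \sum_{j\in\mathbb{Z}} \psi_j Z_{n-j}, \quad n \in \mathbb{Z}, \quad \text{where } \sum_{j\in\mathbb{Z}} |\psi_j|^{\delta} < \infty, \text{ for some } 0 < \delta < \min\{1, \alpha\}. \]

Here $\{Z_n\}_{n\in\mathbb{Z}}$ is an iid sequence of heavy-tailed innovations with exponent $\alpha > 0$. When the $Z_n$'s are
symmetric, we have $\theta = (\psi_+^{\alpha} + \psi_-^{\alpha})/\|\psi\|_{\alpha}^{\alpha}$, where $\psi_+ = \max_j(\psi_j \vee 0)$, $\psi_- = \max_j(-\psi_j \vee 0)$, and
$\|\psi\|_{\alpha}^{\alpha} = \sum_{j\in\mathbb{Z}} |\psi_j|^{\alpha}$ (see, e.g. Corollary 5.5.3 in Embrechts et al. (1997)). We will use iid t-distributed
innovations $Z_n$'s where the degrees of freedom parameter is also equal to the tail index $\alpha$.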

• The moving maxima process $X = \{X_k\}_{k\in\mathbb{Z}}$ is defined as:

\[ X_k := \max_{1 \le i \le m} a_i Z_{k-i+1}, \quad k \in \mathbb{Z}, \tag{5.22} \]

with some coefficients $a_i > 0$, $i = 1, \dots, m$, and $m \ge 1$, where the $Z_k$'s are iid, positive heavy-tailed
random variables with tail exponent $\alpha$. The extremal index $\theta$ of $X$ is: $\theta = \max_{1\le i\le m} a_i^{\alpha} / \sum_{i=1}^{m} a_i^{\alpha}$.
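
The three closed-form expressions can be evaluated directly. The following is a minimal sketch of these formulas; the function names are ours, and the linear-process formula is the one stated above for symmetric innovations.

```python
def theta_armax(b, alpha):
    """Extremal index of X_n = max(b X_{n-1}, (1 - b) Z_n), 0 <= b < 1."""
    return 1.0 - b ** alpha


def theta_linear(psi, alpha):
    """Extremal index of Y_n = sum_j psi_j Z_{n-j} with symmetric innovations."""
    psi_plus = max(max(p, 0.0) for p in psi)
    psi_minus = max(max(-p, 0.0) for p in psi)
    norm = sum(abs(p) ** alpha for p in psi)
    return (psi_plus ** alpha + psi_minus ** alpha) / norm


def theta_moving_max(a, alpha):
    """Extremal index of X_k = max_i a_i Z_{k-i+1} with a_i > 0."""
    powers = [c ** alpha for c in a]
    return max(powers) / sum(powers)
```

For instance, `theta_linear([0.5, 0.2, 0.1], 1.0)` reproduces the value 0.625 used for the linear process with $\alpha = 1$ in the simulation study.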
Simulation setup: For brevity, we present selected results for the processes under consideration that
best demonstrate the behavior of the various estimators.

o $X_n = \max\{b X_{n-1}, (1 - b) Z_n\}$, with $Z_i$ iid standard 1-Fréchet.

o $Y_n = 0.50 Z_n + 0.20 Z_{n-1} + 0.10 Z_{n-2}$, with $Z_i$ iid t-distributed with $\alpha$ degrees of freedom.

o $W_n = \max\{0.80 Z_n, 0.20 Z_{n-1}, 0.40 Z_{n-2}\}$, with $Z_i$ iid Pareto with tail index $\alpha$.

• Parameters: For the armax processes, we fix the tail index at $\alpha = 1$ and vary the coefficient $b$ to obtain a
range of $\theta$ values. The coefficients of the linear and moving maxima processes are fixed (as indicated above),
and the values of $\alpha$ for the $Z_i$'s are varied to obtain a range of $\theta$ values. For all processes, other choices of
the parameters produced analogous results. For each type of process, 500 independent sample paths were
generated of length $2^{13} = 8192$ for the armax and moving maxima processes and $2^{14} = 16384$ for the linear
processes. For each generated sample path, the Ferro-Segers estimator and the runs estimators with run lengths 1, 5, and 9
were computed at each selected threshold. The proposed max-spectrum based estimator was computed using both GLS and WLS
and setting $N_{in} = 25$. The threshold (Ferro-Segers and runs estimators) and the scale (proposed estimator)
parameters achieving the best Root-Mean-Square-Error (RMSE) are reported in Tables 1-3.
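
To make the setup concrete, a sample path of the armax process can be generated as follows. This is a minimal sketch under our own assumptions: standard $\alpha$-Fréchet innovations are drawn by inverting the distribution function $\exp(-x^{-\alpha})$, and the path is started from the stationary marginal, whose scale $(1-b)/(1-b^{\alpha})^{1/\alpha}$ follows from the recursion. The function name `simulate_armax` and the seeding scheme are illustrative and are not taken from the paper.

```python
import math
import random


def simulate_armax(n, b, alpha=1.0, seed=None):
    """Simulate n values of X_k = max(b X_{k-1}, (1 - b) Z_k), 0 <= b < 1."""
    rng = random.Random(seed)

    def frechet():
        # inverse-cdf draw from the standard alpha-Frechet law exp(-x^(-alpha))
        u = rng.random()
        while u == 0.0:          # avoid log(0)
            u = rng.random()
        return (-math.log(u)) ** (-1.0 / alpha)

    # stationary start: alpha-Frechet with scale (1 - b) / (1 - b^alpha)^(1/alpha)
    scale = (1.0 - b) / (1.0 - b ** alpha) ** (1.0 / alpha)
    path = [scale * frechet()]
    for _ in range(n - 1):
        path.append(max(b * path[-1], (1.0 - b) * frechet()))
    return path
```

With `alpha=1.0`, choosing `b = 1 - theta` gives a path with extremal index `theta`, for example `simulate_armax(2**13, 0.5)` for $\theta = 0.5$.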

The results demonstrate that the proposed max-spectrum estimator exhibits a good overall performance 
in terms of RMSE and in many settings it outperforms the Ferro-Segers estimator. The GLS and WLS 
variants produce similar results. The runs estimator performs exceptionally well for the armax process, if
the 'correct' run-length parameter is specified. However, it is quite sensitive to the type of process and
to the choice of the run-length parameter employed. The max-spectrum and Ferro-Segers estimators are 
significantly more robust than the runs estimator to the choice of the model. 



θ      α      GLS      WLS      F/S      Runs-1   Runs-5   Runs-9
0.10   1.00   0.0189   0.0197   0.0140   0.0109   0.0127   0.0137
0.20   1.00   0.0226   0.0256   0.0206   0.0164   0.0218   0.0247
0.30   1.00   0.0325   0.0291   0.0272   0.0223   0.0298   0.0343
0.40   1.00   0.0334   0.0290   0.0306   0.0272   0.0381   0.0440
0.50   1.00   0.0335   0.0308   0.0316   0.0302   0.0436   0.0520
0.60   1.00   0.0350   0.0310   0.0326   0.0316   0.0485   0.0569
0.70   1.00   0.0323   0.0285   0.0348   0.0327   0.0493   0.0584
0.80   1.00   0.0274   0.0243   0.0365   0.0323   0.0508   0.0638
0.90   1.00   0.0212   0.0206   0.0363   0.0284   0.0506   0.0621


Table 1: RMSE values for $X_n = \max\{b X_{n-1}, (1 - b) Z_n\}$, with $Z_i$ iid standard 1-Fréchet. The first column
contains the $\theta$ values. The last 6 columns contain the best RMSE values for the max-spectrum estimates via
GLS, WLS, and the competitors. The sample sizes were fixed at $2^{13}$, with $N_{out} = 500$, and $N_{in} = 25$.

Table 2: RMSE values for $Y_n = 0.50 Z_n + 0.20 Z_{n-1} + 0.10 Z_{n-2}$, with $Z_i$ iid t-distributed. The first column
contains the $\theta$ values. The tail index values are in the second column. The last 6 columns contain the best
RMSE values for the max-spectrum estimates via GLS, WLS, and the competitors. The sample sizes were
fixed at $2^{14}$, with $N_{out} = 500$, and $N_{in} = 25$.

Figure 4 shows boxplots of 500 independent realizations of the WLS variant of the max-spectrum
estimator, computed for a linear process with $\theta = 0.625$. The boxplots for the WLS method (the GLS boxplots were very
similar) and the medians of the estimates of the Ferro-Segers and the runs estimators per threshold
are shown. The runs estimator is quite sensitive to the choice of the run length and exhibits systematic bias.
The Ferro-Segers and max-spectrum estimators are more robust and do not exhibit such strong bias, a fact
observed in numerous other experimental settings.

On the choice of $N_{in}$ and $N_{out}$: The choice of the resampling parameters $N_{in}$ and $N_{out}$ in Step 5 of
Algorithm 1 involves an intricate bias-variance trade-off. Our experience with various sample sizes $n$ and
values for $N_{in}$ and $N_{out}$ shows that larger values for $N_{in}$ lead to smaller variances but larger bias. Extremely
large values of $N_{out}$ may not yield a good resampling approximation of the distribution of the $\hat{\theta}(j)$'s. In
real data and/or for smaller samples (e.g. up to several thousands), we recommend using $N_{in} = 1$ and
$N_{out} = 200$, for example. Using $N_{in} = 1$ yields slightly larger variances, leading to wider confidence
intervals, but prevents missing the 'true value' due to elevated bias. For moderate and large samples, and
if computation time is of lesser concern, we recommend using $N_{in} > 1$. The choice of $N_{in} > 1$
reduces the variance of the estimators, and as long as the value $N_{in} \times N_{out}$ is not too large, relative to the
available sample size, this does not lead to elevated bias.

Table 3: RMSE values for $W_n = \max\{0.80 Z_n, 0.20 Z_{n-1}, 0.40 Z_{n-2}\}$, with $Z_i$ iid Pareto. The first column
contains the $\theta$ values. The tail index values are in the second column. The last 6 columns contain the best
RMSE values for the max-spectrum estimates via GLS, WLS, and the competitors. The sample sizes were
fixed at $2^{13}$, with $N_{out} = 500$, and $N_{in} = 25$.

Figure 4: WLS simulation results for $Y_n = 0.50 Z_n + 0.20 Z_{n-1} + 0.10 Z_{n-2}$, $\theta = 0.625$, $Z_i$ iid t-distributed
with df $= \alpha = 1.00$, and a sample size of $2^{14}$, with $N_{out} = 500$ and $N_{in} = 25$. Left panel: Boxplots of
max-spectrum $\hat{\theta}$. Right panel: $\hat{\theta}$ obtained from the runs and Ferro-Segers estimators. In both plots, the solid
horizontal line corresponds to $\theta = 0.625$.

Automatic selection of scales: We illustrate next the performance of the automatic selection procedure,
introduced in Section 4. We use a subset of the armax, linear and moving maxima processes, described
in the simulation setup above. As before, for each process, we generate 500 independent realizations, of
length $2^{13} = 8192$ for the armax (AM) and moving maxima (MM) processes and $2^{14} = 16384$ for the linear
processes (LP). We now use $N_{out} = 200$ and $N_{in} = 1$ and thus we obtain 200 dependent estimates of $\theta$ per
scale $j$, for each sample path. We apply the automatic selection procedure based on the Kruskal-Wallis test
(at a level of 5%) for each set of 200 resampled $\theta$ estimates. We thus obtain a single $\theta$ estimate per simulated
path.

Table 4: Best RMSE values versus the RMSE from the automatic scale selection procedure.

This procedure is repeated for each independent realization and RMSE values are computed based on
the obtained $\theta$ estimates from the automatic procedure. We report the best RMSE value (lowest RMSE value
among scales), the median and the standard deviation of the estimates based on the automatic procedure and
the same values corresponding to the scale at which the best RMSE value was obtained (as in Tables 1-3).

Table 4 indicates that the automatic selection procedure performs very well in terms of bias (as compared
to the best-RMSE scale). The RMSE values for the automatic selection method are larger than the
best-scale-RMSE values. This is due to the larger variance as seen from the reported standard deviations. Such a
behavior is to be expected since the automatic selection procedure does not involve any knowledge of the true
value of $\theta$. In practice, since $\theta$ is unknown, one cannot identify the best scale $j$ and hence one cannot achieve
the best RMSE. In such a setting the automatic selection procedure appears to perform well, by producing
estimates with low bias and paying a small price in higher variability.
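
The interplay between bias, variability and RMSE referred to above can be made explicit. The following minimal sketch (the function name `error_summary` is ours) computes the three quantities from a set of estimates and a known true value, as is possible in a simulation; they satisfy RMSE² = bias² + sd² when the standard deviation uses the divisor `n`. Note that the tables in the text report the median rather than the mean of the estimates, so this is an illustration of the decomposition, not a reproduction of the reported summaries.

```python
import math


def error_summary(estimates, theta):
    """Bias, standard deviation (divisor n) and RMSE of estimates of theta."""
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - theta
    sd = math.sqrt(sum((e - mean) ** 2 for e in estimates) / n)
    rmse = math.sqrt(sum((e - theta) ** 2 for e in estimates) / n)
    return {"bias": bias, "sd": sd, "rmse": rmse}
```

A procedure with small bias can therefore still have a larger RMSE than the best scale if its estimates are more variable.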

Confidence Intervals: The following variants of confidence intervals were constructed and compared. The
first, based on asymptotic normality (see Theorem 3.2), is given by



9{j) ± 2;(l-g)/2^(j>Y'l/6?^i> 



(5.23) 



where $z_{(1-q)/2}$ is the $(1-q)/2$-th quantile of the standard normal distribution and $n$ and $n_j = \lfloor n/2^j \rfloor$ are the
total sample size and the number of block-maxima involved in the calculation of the $Y_j$ statistic, respectively.
Table 5 displays coverage probabilities for nominal levels .05 and .10 for scales $j$ between 4 and 8, where the
$\hat{\theta}(j)$ estimates typically stabilize. These results are based on 500 independent realizations for each process.
The second type of confidence interval is based on resampled versions of a single sample path of the
data. The computed $\hat{\theta}$ estimates are pooled across a range of scales with reasonable estimates, and
the appropriate empirical quantiles are then taken:



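\[ \Big( \hat{\theta}_{(j_1,j_2)}\big(\tfrac{1-q}{2}\big),\ \hat{\theta}_{(j_1,j_2)}\big(\tfrac{1+q}{2}\big) \Big), \tag{5.24} \]

where $\hat{\theta}_{(j_1,j_2)}(r)$ represents the empirical $r$-th quantile of the pooled $\hat{\theta}(j)$ values across scales $j_1 \le j \le j_2$.
The coverage probabilities based on (5.24) are reported in Table 6.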
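
The pooled-quantile interval can be sketched as follows. This is a minimal sketch under our own assumptions: the resampled estimates are stored in a dictionary keyed by scale, the interval is given by the lower and upper empirical quantiles at levels $(1 \mp q)/2$ of the pooled values, and quantiles are computed by linear interpolation between order statistics (the paper does not specify a quantile convention). The names `empirical_quantile` and `pooled_interval` are ours.

```python
import math


def empirical_quantile(sorted_vals, r):
    """r-th empirical quantile (0 <= r <= 1) by linear interpolation."""
    pos = r * (len(sorted_vals) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def pooled_interval(estimates_by_scale, j1, j2, level=0.95):
    """Quantile interval from estimates pooled over scales j1 <= j <= j2."""
    pooled = sorted(v for j in range(j1, j2 + 1) for v in estimates_by_scale[j])
    lower = empirical_quantile(pooled, (1.0 - level) / 2.0)
    upper = empirical_quantile(pooled, (1.0 + level) / 2.0)
    return lower, upper
```

Pooling over several scales widens the interval relative to a single scale, which partly accounts for the uncertainty in scale selection.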

Tables 5-6 show coverage probabilities for the middle range of scales. The confidence intervals based
on the asymptotic approximation tend to over-cover the parameter $\theta$, as compared to the nominal level. On
the other hand, the resampling-based confidence intervals tend to under-cover $\theta$, on average. Further,
experience shows that for lower scales, the coverage probabilities suffer substantially due to bias; however, as $j$
increases the results rapidly improve. These results indicate that the information from the two types of
confidence intervals, combined, provides useful ball-park estimates for accurate confidence interval estimates for
$\theta$. The difficult problem of obtaining accurate confidence intervals for $\theta$ which work well in practice will be
the focus of future work.

Table 5: Coverage probabilities for a selected set of processes using equation (5.23).

Table 6: Coverage probabilities for a selected set of processes using equation (5.24).



6 Applications 

Crude Oil Data: The daily log returns of West Texas Intermediate (WTI) crude oil prices from
January 2, 1986 to March 6, 2007 (5744 observations) are analyzed and the extremal index estimated. Note
that the daily log returns (referred to as returns henceforth) are approximately equal to the daily percentage
changes in the price. WTI represents a benchmark against which all oil bound for the US is priced and
hence its market is deep and liquid. The data were obtained from the Energy Information Administration (see
http://www.eia.doe.gov/). For a useful reference on oil markets see Geman (2005).

Figure 5 shows a plot of the data and the corresponding returns. The return series appears to be
approximately stationary, with the exception of a few instances, the result of events of major economic impact.
In the top panel, the run up of the oil prices before the first Persian Gulf war can be seen, together with its
subsequent rapid drop once it became apparent that the coalition forces would prevail. A similar pattern is
observed at the onset of the recent Iraq war. The run up in oil prices over the course of the last three years,
accentuated due to sustained demand and growth, is also evident in the plot, together with their steep drop
starting in mid-July 2008.

Analysis of the tail behavior of the data by examining the max-spectrum and Hill estimators indicates
values of $\alpha \approx 3$ and $2.5$ for the right and left tails, respectively. We study separately the time series of
positive (right tail of the distribution) and negative (left tail) returns. This is motivated by the empirical fact
that positive and negative returns exhibit different behavior.

Figure 5: Top Plot: West Texas Intermediate (WTI) crude oil prices from January 2, 1986 to October 7, 2008.
Bottom Plot: The daily log returns of oil prices for the same period.



We estimate next the extremal index $\theta$ of the returns using the max-spectrum, the runs 1, 5, 9 and the
Ferro-Segers estimators. The results are shown in Figure 6. The max-spectrum estimates of $\theta$ were obtained
by setting $N_{out} = 200$ and $N_{in} = 1$ and using WLS. It can be seen that stable $\theta$ estimates for the right tail can
be obtained at scales $j = 4$ to $j = 5$. Pooling these results yields a value of $\hat{\theta} = 0.60$ with a 95% confidence
interval of (0.55, 0.65) based on equation (5.24). It should be noted that the automatic selection procedure
chooses scale $j = 5$ for the right tail, which gives comparable results. The 95% confidence interval obtained
from (5.23) is (0.59, 0.61). The main reason that these confidence intervals are narrow is that they ignore
the uncertainty regarding scale selection. For the left tail, we choose the median value at scales $j = 5$ to
$j = 6$ and obtain a pooled estimate of 0.53 with a 95% confidence interval of (0.47, 0.61) using (5.24) and
(0.51, 0.55) using (5.23) and $j = 5$.

A reasonably stable estimate obtained from the Ferro-Segers procedure is around 0.50 for the right tail
and 0.42 for the left one. However, another choice for the left tail is 0.53, corresponding to the range of 0.90th
to 0.92nd quantiles. The max-spectrum and Ferro-Segers estimates are to some extent in agreement for the
right tail and possibly for the left tail as well, depending on the choice of a stable range for the Ferro-Segers
estimate. On the other hand, the results of the runs-1 estimator are highly suspect. The results of the runs-1
estimator indicate little or no clustering of extremes (as $\hat{\theta} \approx 1$). The fact that runs-1 fails to capture the clustering may
be explained by the behavior of financial returns, where one extremely large positive return is commonly
followed by a large negative return. Thus, runs-1 often identifies clusters with a single extreme value, as in
the case of independent data. Increasing the run-length parameter yields estimates more in
agreement with the other two procedures. The results strongly suggest clustering of large losses and gains
that can in turn have serious consequences in terms of risk exposure of portfolios that include WTI.

Figure 6: Top Row: Estimates of $\theta$ for the right tail. The left panel is the max spectrum estimates. The right
panel is the Ferro-Segers and runs estimates. The solid horizontal line in both plots corresponds to the max
spectrum point estimate of 0.60. Bottom Row: Estimates of $\theta$ for the left tail. The left panel is the max
spectrum estimates. The right panel is the Ferro-Segers and runs estimates. The solid horizontal line in both
plots corresponds to the max spectrum point estimate of 0.60.

The next two examples illustrate our extremal index estimator over two financial data sets: (i) daily returns
of the S&P 500 stock index and (ii) high-frequency, tick-by-tick volumes of a traded stock. The extremal
index estimates behave differently in these two settings over the largest scales $j$. We discuss how the plot
of the $\hat{\theta}(j)$'s, as a function of $j$, may be used to detect different regimes of clustering of extremes. For
simplicity, we focus on $\hat{\theta}(j)$'s obtained by weighted least squares, $N_{in} = 1$ and $N_{out} = 200$ independent
permutations of the data. The results with other choices of the parameters, or ones involving bootstrap instead
of permutations, are similar.

Daily S&P 500 returns (1960-2007): Figure 7 shows the extremal index estimates of the gains and losses
for the daily returns of the S&P 500 stock index. The top panel indicates that both the gains and the losses
time series have heavy tails. Indeed, max-spectrum estimates of the left- and right-tail exponents yield
$\hat{\alpha}_{loss} = 2.958$ and $\hat{\alpha}_{gain} = 3.553$. These values confirm the common observation that the tails of the losses
are slightly heavier than the tails of the gains (see e.g. Table 1 in Galbraith and Zernov (2006)). The bottom
two panels in Figure 7 show boxplots of resampled estimates of the extremal index as a function of the
scale $j$. We studied separately the time series of the gains or positive returns (left panel) and the losses (right
panel).

Figure 7: Top panel: S&P 500 index (daily returns), Jan 1, 1960 to Dec 31, 2007. Bottom panels: boxplots of the $\hat{\theta}(j)$'s, as a function of the scale $j$, obtained from
$N_{out} = 200$ independent permutations of the data. (The corresponding bootstrap-based versions are similar
and omitted for brevity.) The left panel corresponds to the time series of positive returns (gains); the right
panel to the time series of the absolute values of the negative returns (losses). Observe that the extremal
index estimates over the largest scales approach 1 for both the gains and the losses.

For the gains, the boxplots stabilize at scales $j = 7$ to 9 (as also confirmed by the Kruskal-Wallis
analysis). As for the oil data, by pooling the $\hat{\theta}(j)$'s for this range of scales, we obtain $\hat{\theta}_{gains} \approx 0.31$ with
95% confidence interval (0.23, 0.39) based on (5.23) and scale $j = 7$. The confidence interval based on
(5.24) and pooling scales $j = 7$ to 9 together is (0.16, 0.43). Similar analysis for the losses shows that the
$\hat{\theta}(j)$'s stabilize over the range $j = 6$ to 8, and the pooled estimate is $\hat{\theta}_{loss} \approx 0.416$. The 95% confidence
interval based on (5.23) and scale $j = 6$ is (0.34, 0.49), and the one based on the pooled scales and (5.24)
is (0.34, 0.50). Our results are in agreement with the Ferro-Segers and runs estimates (for 200 threshold
exceedances therein) of $\theta_{loss}$ reported in Figure 3b of Galbraith and Zernov (2006).

Our analysis indicates that the extremal indices of both the gains and the losses time series of daily
S&P 500 returns are lower than the estimates corresponding to the Oil data set. This, as before, shows that
extremes of the gains and the losses exhibit significant clustering, which can have far reaching consequences
in terms of risk management. In contrast to the Oil data set, however, the left tails (losses) have slightly
higher extremal index than the right tails (gains). This results in slightly more temporal clustering of the
extreme gains as compared to the extreme losses. Indeed, the expected cluster sizes for the extreme gains
and losses are about $1/\hat{\theta}_{gains} \approx 3.23$ and $1/\hat{\theta}_{loss} \approx 2.5$, respectively.

Figure 8: Top panels: High-frequency volume time series for the Intel stock during the days of Nov 16 (left
panel) and Nov 22 (right panel) in 2005. The data are ordered in time and every value corresponds to the
number of traded shares during one transaction. There are 75,993 trades on Nov 16 and 119,840 trades on
Nov 22. Bottom panels: boxplots of the $\hat{\theta}(j)$'s obtained from $N_{out} = 200$ independent permutations of the
data corresponding to the top two data sets.

The above estimates yield a single value for the extremal index $\theta$ based on a judicious choice of scales.
In practice, the boxplots for the entire range of available scales, however, can also give important insights. In
the above analysis, we focus on the range of scales $j = 6$ to 9, which roughly corresponds to focusing on the
range of probabilities [0.9844, 0.9980]. Therefore, from a physical perspective, the extremal index estimates
are useful and applicable for the extremes occurring on a time scale of up to $2^9 = 512$ trading days or up to
2 years, on average. Over a range of 1 to 2 years, one can indeed expect that the S&P 500 returns are
approximately stationary and our theory applies. Significant structural changes and cycles in the economy,
however, lead to non-stationarity over longer periods of time. Therefore, the extremal index estimates $\hat{\theta}(j)$'s
for scales $j \ge 10$ should also be considered, but interpreted with care. Indeed, as seen from Figure 7, the
estimates $\hat{\theta}(j)$'s approach 1, as $j$ grows beyond 9. For the largest scales ($j = 11$ or 12), the extremal indices
of the gains and losses are essentially 1. Since $\theta$ measures the degree of clustering or dependence of extremes,
this suggests that the largest extremes of the S&P 500 returns are perhaps weakly dependent or independent.
Indeed, the largest extremes correspond to select few financial crashes or periods of extreme volatility. These
events occur far apart in time, they do not cluster, and therefore $\hat{\theta}(j) \approx 1$.
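
The correspondence between scales, block lengths and probability levels quoted above can be checked with a short calculation. This is a minimal sketch, assuming that scale $j$ corresponds to blocks of $2^j$ observations and to the probability level $1 - 2^{-j}$ (an assumption on our part that reproduces the range [0.9844, 0.9980] quoted for $j = 6$ to 9); the function name `scale_summary` is ours.

```python
def scale_summary(j):
    """Block length 2^j and the matching probability level 1 - 2^(-j)."""
    block_length = 2 ** j
    return block_length, 1.0 - 1.0 / block_length


for j in (6, 9, 10):
    length, level = scale_summary(j)
    print(f"j = {j}: block length {length}, probability level {level:.4f}")
```

At $j = 9$ a block spans 512 trading days, which is the roughly two-year horizon mentioned in the text.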

High-frequency Stock Volume: Figure 8 illustrates the extremal index estimators over two high-frequency
data sets of traded volumes. The time series consist of the number of traded shares of Intel Inc. for each
and every transaction occurring during two typical days of trading (Nov 16 and 22 in 2005). The data were
obtained from the TAQ (trades and quotes) data base of consolidated trades from the NYSE and NASDAQ
exchanges (see Wharton Research Data Service (url)). One easily sees that reasonable extremal index
estimates for Nov 16 and Nov 22 are about $\hat{\theta} \approx 0.8$. The corresponding boxplots are stable over a wide range
of scales (e.g. $j = 6$ to 10). Beyond scales $j = 10$, however, one should interpret the estimates $\hat{\theta}(j)$'s with
care. Indeed, about $2^{10} = 1024$ trades of the highly liquid Intel stock occur over the time scale of about 4
to 5 minutes (depending on the time of the day and the amount of trading during the day). Over the time
scale of 4 to 5 minutes, one can expect to have relatively stationary trading patterns. Longer periods of time,
however, involve intra-day seasonality and other intricate non-stationarity due to arrival of new information.
Therefore, the boxplots of the $\hat{\theta}(j)$'s involve a 'change of regime' for scales $j > 10$. This change of regime
is relatively abrupt for the November 22 data set and gradual but systematic for the November 16 data. In
both cases, the extremal index estimates drastically approach zero, as the scales become more extreme. This
implies that the clustering of the largest extremes is substantially more pronounced than that of the moderate
extremes. This effect is also confirmed by the top plots where extremely large volumes are traded in batches.
This phenomenon is in stark contrast with the observed weakening of the clustering for the returns data in
Figure 7. This difference may be attributed to the difference in the nature of the extreme traded volumes and
extreme stock returns. Large returns/losses in the S&P 500 index are global, market-wide events that are
hard to control or manipulate. Extremely large volumes, on the other hand, are traded by select individual
players in the market. Typically, large transactions are split in batches and traded systematically over a longer
period of time to minimize the negative effect of a large volume trade on the stock price.

7 Concluding Remarks 

In this paper, a novel procedure for estimating the extremal index of stationary time series was introduced. It
is based on scaling properties of block-maxima and on resampling. Under certain mild regularity conditions,
its consistency and asymptotic normality were established for $m$-dependent time series, which provides one
way of constructing confidence intervals. A more computationally intensive procedure based on resampling
is also presented for the same task. A comprehensive simulation study shows the competitive nature of
the proposed estimator. Finally, the estimator is illustrated on a number of financial data sets that also
demonstrate additional diagnostic features of various statistical plots based on the new estimator.

Appendix

Proposition 7.1 Suppose that $f : (0, \infty) \to \mathbb{R}$ is an absolutely continuous function on any compact interval $[a, b] \subset (0, \infty)$,
and such that $f(x) = f(x_0) + \int_{x_0}^{x} f'(u)\,du$, $x > 0$, for some (any) $x_0 > 0$.
Let for some $m \in \mathbb{R}$ and $\delta > 0$,

\[ x^{m}|f(x)| + \operatorname{ess\,sup}_{0 < y \le x}\, y^{m}|f'(y)| \to 0, \quad \text{as } x \downarrow 0, \tag{7.25} \]

\[ x^{-m}|f(x)| + x^{1+\delta} \operatorname{ess\,sup}_{y \ge x}\, y^{-m}|f'(y)| \to 0, \quad \text{as } x \to \infty. \tag{7.26} \]

Suppose also that the time series $X = \{X_n\}_{n\in\mathbb{Z}}$ satisfies Conditions 1 and 2, where $c_1(x)$ is such that:

\[ \int_1^{\infty} c_1(x)\, x^{-\alpha} |f'(x)|\,dx < \infty. \tag{7.27} \]

Then, $E|f(M_n)| < \infty$, for all sufficiently large $n \in \mathbb{N}$, and for some $C_f > 0$, independent of $n$,

\[ |E f(M_n/n^{1/\alpha}) - E f(Z)| \le C_f\, n^{-\beta}, \tag{7.28} \]

where $Z$ is an $\alpha$-Fréchet variable with scale coefficient $\sigma := c^{1/\alpha}$.

Proof: The proof is similar to the proof of Theorem 3.1 in Stoev et al. (2006). Indeed, as in the above reference, one
can show that $E|f(Z)| < \infty$ and $E|f(M_n)| < \infty$, for all sufficiently large $n$. Further, by using the conditions (7.25)
and (7.26) and integration by parts, we have that

\[ E f(M_n/n^{1/\alpha}) - E f(Z) = \int_0^{\infty} (G(x) - F_n(x)) f'(x)\,dx, \tag{7.29} \]

where $F_n(x) := P\{M_n/n^{1/\alpha} \le x\}$ and $G(x) = P\{Z \le x\}$. Since $F_n(x) = e^{-c(n,x)x^{-\alpha}}$, by the mean value theorem,
we have

\G{x)-F^{x)\ = |e-'=-'^^"° _g-c(n,aO:r-°| < |c(n, x) - cjf l^-^e" """{^'='^- ' '=("'^)>^"° 
< n-^ci{x)x-'' ('e-'^^^"'""^' + e-""-^^"") , 

where in the last inequality, we used Relations (3.12) and (3.13).
Thus, by (7.29), we have that

/•oo 

|E/(M„/ni/")-E/(Z)| < n~^ ci(x)2;~"|/'(x)K e''^^^"'""^' + e^'^^^^Jda: 

.: "-(/'+f ). a.30, 

The last integral is finite. Indeed, since the exponential terms above are bounded, Relation (7.27) implies that the
integral "$\int_1^{\infty}$" is finite. On the other hand, conditions (3.12) and (7.25) imply that $c_1(x)|f'(x)| = O(x^{-R})$, $x \downarrow 0$,
for some $R \in \mathbb{R}$. However, for all $p > 0$, the sum of the two exponential terms above is $o(x^{p})$, $x \downarrow 0$, since $\alpha - \gamma > 0$. This
implies that the integral "$\int_0^{1}$" in (7.30) is also finite. This completes the proof of (7.28). □

Proposition 7.2 Let $X = \{X_k\}_{k\in\mathbb{Z}}$ be a strictly stationary time series which satisfies Conditions 1 and 2 in Section 3
above. Suppose that $\int_1^{\infty} c_1(x)\, x^{-\alpha-1+\delta}\,dx < \infty$, for some $\delta > 0$.

Then, with $M_n := \max_{1\le k\le n} X_k$, we have $E|\ln(M_n)|^{p} < \infty$, for all $p > 0$ and all sufficiently large $n \in \mathbb{N}$.
Moreover, for any $p > 0$ and $k \in \mathbb{N}$, we have:

\[ E|\ln(M_n/n^{1/\alpha})|^{p} - E|\ln(Z)|^{p} = O(n^{-\beta}), \quad \text{and} \quad E(\ln(M_n/n^{1/\alpha}))^{k} - E(\ln(Z))^{k} = O(n^{-\beta}), \]

as $n \to \infty$, where $Z$ is an $\alpha$-Fréchet random variable with scale coefficient $\theta^{1/\alpha} c_X^{1/\alpha}$.

Proof: It is enough to show that the functions $f(x) := |\ln(x)|^{p}$ and $f(x) := (\ln(x))^{k}$, $p > 0$, $k \in \mathbb{N}$ satisfy the
conditions of Proposition 7.1. In the first case, for example, $|f'(x)| = p\,x^{-1}|\ln(x)|^{p-1}$, $x > 0$. Therefore, the assumption
$\int_1^{\infty} c_1(x)\, x^{-\alpha-1+\delta}\,dx < \infty$ implies (7.27), since $|\ln(x)|^{p-1} \le \text{const}\; x^{\delta}$, for all $x \in [1, \infty)$. The conditions (7.25) and
(7.26) are also fulfilled in this case, and hence Proposition 7.1 yields the desired order of convergence. The functions
$f(x) = (\ln(x))^{k}$, $k \in \mathbb{N}$ can be treated similarly. □



Note that, under the assumptions of Proposition 7.2, we readily obtain:

\[ E(Y_j - j/\alpha) = E\log_2\big(D(j,k)/2^{j/\alpha}\big) = E\log_2\big(\theta^{1/\alpha} c_X^{1/\alpha} Z_1\big) + O(1/2^{j\beta}), \tag{7.31} \]

as $j \to \infty$, where $Z_1$ is a standard $\alpha$-Fréchet variable. This important fact is used in the proofs of the asymptotic
results given below.
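
For orientation, the leading term on the right-hand side of (7.31) can be written out explicitly. The computation below is a standard one and is not part of the original argument; it assumes that a standard $\alpha$-Fréchet variable has distribution function $\exp(-x^{-\alpha})$, $x > 0$, so that $Z_1^{-\alpha}$ is a standard exponential variable, and it writes $\gamma_E$ for the Euler-Mascheroni constant and $c_X$ for the scale constant (both are our notation).

```latex
\[
  E \ln Z_1 = -\frac{1}{\alpha}\, E \ln\big(Z_1^{-\alpha}\big) = \frac{\gamma_E}{\alpha},
  \qquad
  E \log_2\big(\theta^{1/\alpha} c_X^{1/\alpha} Z_1\big)
    = \frac{1}{\alpha}\log_2(\theta\, c_X) + \frac{\gamma_E}{\alpha \ln 2}.
\]
```

Under these assumptions, the extremal index enters the mean of $Y_j - j/\alpha$ only through the additive term $\alpha^{-1}\log_2\theta$.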

Proof of Proposition 3.1: Recall that by (2.2),

\[ D(j,k) := \bigvee_{i=1}^{2^j} X_{2^j(k-1)+i} \quad \text{and introduce} \quad \widetilde{D}(j,k) := \bigvee_{i=1}^{2^j - m} X_{2^j(k-1)+i}. \tag{7.32} \]

Observe that $\widetilde{D}(j,k)$, $k = 1, \dots, n_j$ ($n_j = \lfloor n/2^j \rfloor$) are independent in $k$ since they are "separated by $m$"
block-maxima of the $m$-dependent process $X$.

Recall also that by (2.4),

\[ Y_j := \frac{1}{n_j}\sum_{k=1}^{n_j} \log_2 D(j,k) \quad \text{and introduce the statistics} \quad \widetilde{Y}_j := \frac{1}{n_j}\sum_{k=1}^{n_j} \log_2 \widetilde{D}(j,k). \]

We first establish Relation (3.17). Let

\[ \hat{H} = \sum_{i=0}^{\ell} w_i Y_{i+j(n)}, \quad \text{and} \quad \widetilde{H} = \sum_{i=0}^{\ell} w_i \widetilde{Y}_{i+j(n)}, \tag{7.33} \]

so that $\hat{\alpha}(j)$ in (3.15) equals $1/\hat{H}$. The weights $w_i$'s, the range $\ell$ and the quantity $j(n)$ are described in Section 3.
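
The quantities entering (7.33) are straightforward to compute. The following minimal sketch assumes positive data, blocks of length $2^j$ taken consecutively from the start of the sample, and user-supplied weights (the choice of weights is described in Section 3 and is not reproduced here); the function names are ours.

```python
import math


def block_maxima(x, j):
    """Maxima D(j, k) over consecutive blocks of length 2^j, k = 1..n_j."""
    size = 2 ** j
    n_j = len(x) // size
    return [max(x[size * k:size * (k + 1)]) for k in range(n_j)]


def max_spectrum(x, j):
    """Y_j: the average of log2 of the block maxima at scale j."""
    d = block_maxima(x, j)
    if not d:
        raise ValueError("the sample is shorter than one block at this scale")
    return sum(math.log2(v) for v in d) / len(d)


def weighted_spectrum(x, j0, weights):
    """Weighted sum of Y_{i + j0} over i = 0..len(weights) - 1."""
    return sum(w * max_spectrum(x, j0 + i) for i, w in enumerate(weights))
```

With regression-type weights, the reciprocal of `weighted_spectrum` plays the role of the tail exponent estimate $\hat{\alpha}(j)$ in the text.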

To prove that $\hat{\alpha}(j) - \alpha = O_P(a_n)$, $n \to \infty$, for some $a_n \to 0$, it suffices to show that $E(\hat{H} - H)^2 = O(a_n^2)$, where
$H := 1/\alpha$. Observe that by adding and subtracting the term $\widetilde{H}$, and by applying the inequality $(a + b)^2 \le 2a^2 + 2b^2$,
we get

\[ E(\hat{H} - H)^2 \le 2E(\hat{H} - \widetilde{H})^2 + 2E(\widetilde{H} - H)^2 = 2\mathrm{Var}(\hat{H} - \widetilde{H}) + 2(E\hat{H} - E\widetilde{H})^2 + 2E(\widetilde{H} - H)^2 =: 2A_1 + 2A_2 + 2A_3, \tag{7.34} \]

where in the last relation we also used the fact that $E\xi^2 = \mathrm{Var}(\xi) + (E\xi)^2$.

We will first show that $A_1 = o(1/n_j)$ in (7.34) is negligible. Indeed, by (7.33), we have

\[ \hat{H} - \widetilde{H} = \sum_{i=0}^{\ell} w_i \big(Y_{i+j(n)} - \widetilde{Y}_{i+j(n)}\big), \tag{7.35} \]

and thus by using the inequahty Var(^o + ^ U) < (^ + l)^(Var(^o) H + Var(^f)), we get Var(ii - H) < 

(1 + i)^ X)i;=o ^f Var(Fi_|_j(„) — Yi^j(^n))- Thus, by Lemma ITTI below. since I is fixed, 

Var(iJ -H)< S^ ^ Var(log2 D{i+j{n), 1) - log^ i?(i + j(n), 1)), (7.36) 

■J 1=0 

where rij ~ n/2''("^. Lemmas I7.2l and l731 on the other hand, yield 

Var(i? - H) = o{l/nj), as n ^ oo. (7.37) 

Now, we focus on the term $A_2$ in (7.34). By (7.35), we have

VA^ = ^m,(Ey,H.j(„) -Ey,+,(„)) =E^u;,log2(^(*+.7N,l)/2<'+''^""/") 

i=0 2=0 

-E^u;aog2(5(*+j(n),l)/2(*+j("))/") 

4=0 

= 5]m,Elog2(Z) + o(l/2^"(")^) 

4=0 

i _ , 

-^^,(Elog2p(*+j(n),l)/(2*+^(") -m)i/") - -log2((2*+^(") -™)/2'+-''("))), 
4=0 "^ 

where the last relation follows from (7.31) and where $Z$ is an $\alpha$-Fréchet variable with scale coefficient $(\theta c_X)^{1/\alpha}$. Now,
since $\widetilde{D}(i + j(n), 1)/(2^{i+j(n)} - m)^{1/\alpha}$ is a properly normalized block-maximum (recall (7.32) above), by Relation
(7.31), we further have that

e e 

^ = ^^,Elog2(Z)-^m,Elog2(Z)+o(l/2^(")«) + 0(log2(l-m/2J(")) 

4 = 1=0 

= o(l/2^'(")'^) + 0(l/2J'(")), 




as $j(n) \to \infty$, since $\log_2(1 - x) = O(x)$, $x \to 0$. We thus have,

\[ A_2 = O\big(1/2^{2j(n)\min\{1,\beta\}}\big), \quad \text{as } j(n) \to \infty. \tag{7.38} \]

Consider now the term $A_3$ in (7.34). As above, we have

E(F - Hf = Yia{H - H) + {EH - Hf =: A'^ + A'^, 

and as in ( |7.36l l, we get A3 < {£ + 1)^ J2i=o ^i'^^^O^i+jin) ) — o{l/nj) ~ o(2^'")/n), as Uj — > cx). Also, as argued 
above, since X],;=o ""^'(^ + Ji''^))/'^ = l/^^ = ^> we obtain 

EH-H = ^m,(Elog2 5(z + j'H, 1) - {i+j{n))/a) = 0(i/2^-(«)min{i,«)^ 

•t=0 

as $j(n) \to \infty$ (see (7.38) above). By combining the bounds for terms $A_1$, $A_2$ and $A_3$ in (7.37), (7.38) and the last two
relations, we obtain

\[ \hat{H} = H + O_P\big(1/2^{j(n)\min\{1,\beta\}}\big) + O_P\big(2^{j(n)/2}/n^{1/2}\big), \quad \text{as } j(n),\ n/2^{j(n)} \to \infty. \]

This completes the proof of the first asymptotic relation in (3.17).

The proof of the second asymptotic relation in (3.17) is simpler. By introducing the quantity $\widetilde C(j) := \widetilde Y_j - j/\alpha$, we have

$$\widehat C(j) - \widetilde C(j) = Y_j - \widetilde Y_j = \frac{1}{n_j} \sum_{k=1}^{n_j} \log_2\big(D(j,k)/\widetilde D(j,k)\big).$$

One can similarly show that $\mathrm{Var}(\widehat C(j) - \widetilde C(j))$ is of order $o(1/n_j)$, as $n \to \infty$. Thus, the order of $\widehat C(j) - C$ is dictated by the orders of the bias and standard error for the quantity $\widetilde C(j)$. These can be handled in the same way as the terms $A_2$ and $A_3$ above.

The following three lemmas were used in the proof of Proposition 3.1.

Lemma 7.1 Under the conditions of Proposition 3.1, for all $j > \log_2 m$, we have

$$\mathrm{Var}(\widetilde Y_j - Y_j) \le \frac{3}{n_j}\, \mathrm{Var}\big(\log_2(\widetilde D(j,1)/D(j,1))\big).$$

Proof: For notational simplicity, let $\xi_k := \log_2(\widetilde D(j,k)/D(j,k))$, $k = 1, \ldots, n_j$. We have, by the stationarity of $\xi_k$ in $k$, that

$$\mathrm{Var}(\widetilde Y_j - Y_j) = \frac{1}{n_j} \mathrm{Var}(\xi_1) + \frac{2}{n_j^2} \sum_{k=1}^{n_j - 1} (n_j - k)\, \mathrm{Cov}(\xi_{k+1}, \xi_1).$$

Note that $\xi_{k+1} = \log_2(\widetilde D(j,1+k)/D(j,1+k))$ and $\xi_1 = \log_2(\widetilde D(j,1)/D(j,1))$ are independent if $k > 1$. Indeed, this follows from the fact that the process $X$ is $m$-dependent, and since $\xi_{k+1}$ and $\xi_1$ depend on blocks of the data separated by at least $2^j > m$ lags. Therefore, only the lag-1 covariances in the above sum will be non-zero and hence

$$\mathrm{Var}(\widetilde Y_j - Y_j) \le \frac{1}{n_j} \mathrm{Var}(\xi_1) + \frac{2}{n_j} |\mathrm{Cov}(\xi_2, \xi_1)| \le \frac{3}{n_j} \mathrm{Var}(\xi_1),$$

since by the Cauchy-Schwarz inequality we have $|\mathrm{Cov}(\xi_2, \xi_1)| \le \mathrm{Var}(\xi_2)^{1/2}\, \mathrm{Var}(\xi_1)^{1/2} = \mathrm{Var}(\xi_1)$. This completes the proof of the lemma. □
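The variance bound in Lemma 7.1 can be checked on a toy example. The sketch below is an illustration only: it assumes a hypothetical 1-dependent moving-average sequence $\xi_k = e_k + a\,e_{k+1}$ with unit-variance iid noise (not the log-ratios of block-maxima used in the paper) and compares the exact variance of the sample mean with the bound $3\,\mathrm{Var}(\xi_1)/n$. The function names are made up for this example.

```python
def var_of_mean_lag1(n, var1, cov1):
    """Exact variance of the mean of n stationary terms whose
    covariances vanish beyond lag 1.

    General formula: Var(mean) = var1/n + (2/n^2) * sum_{k=1}^{n-1} (n-k) * Cov(xi_{k+1}, xi_1);
    here only the k = 1 term is non-zero.
    """
    return var1 / n + 2.0 * (n - 1) * cov1 / n**2


def lemma_bound(n, var1):
    """Upper bound of the form 3 * Var(xi_1) / n."""
    return 3.0 * var1 / n


def ma1_check(a, n):
    """Hypothetical example: xi_k = e_k + a * e_{k+1}, e_k iid with unit variance.

    Then Var(xi_1) = 1 + a^2 and Cov(xi_2, xi_1) = a.
    Returns (exact variance of the mean, bound).
    """
    var1 = 1.0 + a * a
    cov1 = a
    return var_of_mean_lag1(n, var1, cov1), lemma_bound(n, var1)


if __name__ == "__main__":
    for a in (-1.0, 0.0, 1.0):
        exact, bound = ma1_check(a, 100)
        print(f"a={a:+.1f}: exact={exact:.6f}, bound={bound:.6f}")
```

The bound is never tight in this example, which is consistent with the two crude steps in the proof (replacing $(n_j - 1)/n_j^2$ by $1/n_j$ and applying Cauchy-Schwarz).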

Lemma 7.2 For $D(j,k)$ and $\widetilde D(j,k)$, defined in (7.32) above, for any fixed $k$, we have $\widetilde D(j,k)/D(j,k) \stackrel{P}{\longrightarrow} 1$, as $j \to \infty$.

Proof: Let $\delta \in (0, 1/\alpha)$ be arbitrary and observe that

$$\mathbb{P}\{\widetilde D(j,k)/D(j,k) < 1\} \le \mathbb{P}\{R > \widetilde D(j,k)\} \le \mathbb{P}\{R > 2^{j\delta}\} + \mathbb{P}\{2^{j\delta} > \widetilde D(j,k)\}, \tag{7.39}$$

where $R = \max_{1 \le i \le m} X_{2^j(k-1)+i}$. Now, by stationarity,

$$\mathbb{P}\{R > 2^{j\delta}\} = \mathbb{P}\Big\{\max_{1 \le i \le m} X_i > 2^{j\delta}\Big\} \to 0, \quad \text{as } j \to \infty.$$

On the other hand, Relation (3.12) implies that $2^{-j/\alpha} \widetilde D(j,k) \stackrel{d}{\longrightarrow} Z$, as $j \to \infty$, where $Z$ is a non-degenerate $\alpha$-Fréchet variable. Thus, since $\delta \in (0, 1/\alpha)$, we have that

$$\mathbb{P}\{2^{j\delta} > \widetilde D(j,k)\} \to 0, \quad \text{as } j \to \infty.$$

The last two convergences and the inequality (7.39) imply that $\mathbb{P}\{\widetilde D(j,k)/D(j,k) < 1\} \to 0$, $j \to \infty$. Since trivially $\mathbb{P}\{\widetilde D(j,k)/D(j,k) \le 1\} = 1$, we obtain that $\widetilde D(j,k)/D(j,k)$ converges in distribution to the constant 1, as $j \to \infty$. This completes the proof since convergence in distribution to a constant implies convergence in probability. □
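The statement of Lemma 7.2 can be illustrated by simulation. The following sketch is a toy example under stated assumptions, not the setting of the paper's data: it uses a hypothetical $m$-dependent max-moving-average series $X_i = \max(E_i, \ldots, E_{i+m})$ built from iid 1-Fréchet variables $E_i$, takes the truncated block-maximum to be the maximum over the block with its first $m$ observations dropped, and estimates the probability that the truncated and full block-maxima differ. All function names are made up for this illustration.

```python
import math
import random


def frechet1(rng):
    """Draw a 1-Frechet variable: P(E <= x) = exp(-1/x), x > 0."""
    u = rng.random()
    while u == 0.0:          # avoid log(0)
        u = rng.random()
    return -1.0 / math.log(u)


def fraction_ratio_below_one(j, m=1, trials=2000, seed=0):
    """Monte Carlo estimate of P(truncated block-max < full block-max)
    for one block of size 2^j of an m-dependent series."""
    rng = random.Random(seed)
    size = 2 ** j
    count = 0
    for _ in range(trials):
        e = [frechet1(rng) for _ in range(size + m)]
        # m-dependent series: X_i = max(E_i, ..., E_{i+m})
        x = [max(e[i:i + m + 1]) for i in range(size)]
        d_full = max(x)
        d_trunc = max(x[m:])  # drop the first m observations of the block
        if d_trunc < d_full:
            count += 1
    return count / trials


if __name__ == "__main__":
    for j in (3, 5, 8):
        print(j, fraction_ratio_below_one(j))
```

In this toy model the two maxima differ only when the overall maximum of the underlying noise falls among the first $m$ noise terms, so the estimated fraction should shrink roughly like $m/2^j$ as the scale $j$ grows.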



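Lemma 7.3 The set of random variables

$$\Big\{ \big|\log_2\big(\widetilde D(j,k)/D(j,k)\big)\big|^p,\ j, k \in \mathbb{N} \Big\}$$

is uniformly integrable, for all $p > 0$, where $D(j,k)$ and $\widetilde D(j,k)$ are defined in (7.32).

Proof: Let $q > p$ be arbitrary. By using the inequality $|x + y|^q \le 2^q (|x|^q + |y|^q)$, $x, y \in \mathbb{R}$, we get

$$\mathbb{E}\Big|\log_2 \frac{\widetilde D(j,k)}{D(j,k)}\Big|^q \le 2^q\, \mathbb{E}\big|\log_2\big(\widetilde D(j,k)/2^{j/\alpha}\big)\big|^q + 2^q\, \mathbb{E}\big|\log_2\big(D(j,k)/2^{j/\alpha}\big)\big|^q.$$

In view of Proposition 7.2 applied to the block-maxima $D(j,k)$ and $\widetilde D(j,k)$, we obtain

$$\mathbb{E}\big|\log_2\big(D(j,k)/2^{j/\alpha}\big)\big|^q = \mathbb{E}\big|\log_2\big(M_{2^j}/2^{j/\alpha}\big)\big|^q \to \mathrm{const}, \quad \text{as } j \to \infty.$$

Thus the set $\{\mathbb{E}|\log_2(D(j,k)/2^{j/\alpha})|^q,\ j, k \in \mathbb{N}\}$ is bounded. We similarly have that the set $\{\mathbb{E}|\log_2(\widetilde D(j,k)/2^{j/\alpha})|^q\}_{j,k \in \mathbb{N}}$ is bounded since $\log_2(2^j - m) \sim j$, $j \to \infty$, for any fixed $m$.

We have thus shown that

$$\sup_{j,k \in \mathbb{N}} \mathbb{E}\Big|\log_2 \frac{\widetilde D(j,k)}{D(j,k)}\Big|^q < \infty,$$

for $q > p$, which yields the desired uniform integrability. □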

Proof of Lemma 3.1: Suppose that the indices $i_1, \ldots, i_k$ are drawn without replacement. Let $A_1 := \Omega$ and

$$A_j := \{\omega \in \Omega : |i_\ell(\omega) - i_{\ell'}(\omega)| > m, \text{ for all } \ell \ne \ell',\ 1 \le \ell, \ell' \le j\}, \tag{7.40}$$

for $j \ge 2$, that is, $A_j$ is the event that the first $j$ random indices are spaced away from each other by more than $m$ lags. By convention, we let $A_1$ denote the almost certain event, so that $\mathbb{P}(A_1) = 1$.

We need to show that $\mathbb{P}(A_k) \ge 1 - mk^2/(n - k)$. Note that, since $\mathbb{P}(A_1) = 1$ by convention, for all $j \ge 1$, we obtain

$$\mathbb{P}(A_{j+1}) = \mathbb{P}(A_{j+1} \mid A_j)\, \mathbb{P}(A_j) \ge \big(1 - 2mj/(n - j)\big)\, \mathbb{P}(A_j). \tag{7.41}$$

Indeed, the probability $\mathbb{P}(A_{j+1}^c \mid A_j)$ of choosing the index $i_{j+1}$ to be within $m$ lags from at least one of the chosen $j$ indices $i_1, \ldots, i_j$ is at most $2mj/(n - j)$. Thus,

$$\mathbb{P}(A_k) = \mathbb{P}(A_1) \prod_{j=1}^{k-1} \mathbb{P}(A_{j+1} \mid A_j) \ge \prod_{j=1}^{k-1} \big(1 - 2mj/(n - j)\big).$$

Now, by the inequality $\prod_{i=1}^{n} (1 - x_i) \ge 1 - \sum_{i=1}^{n} x_i$, valid for all $x_i \in [0, 1]$, we obtain

$$\mathbb{P}(A_k) \ge 1 - \sum_{j=1}^{k-1} 2mj/(n - j) \ge 1 - mk(k-1)/(n - k) \ge 1 - mk^2/(n - k). \tag{7.42}$$

The case when the indices are drawn with replacement is similar. □
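The chain of bounds in (7.41) and (7.42) can be checked numerically. The sketch below is an illustration with hypothetical parameter values ($n = 10000$, $k = 20$, $m = 5$), restricted to sampling without replacement. It computes the product bound, the simpler bound $1 - mk^2/(n-k)$, and a Monte Carlo estimate of $\mathbb{P}(A_k)$; the function names are assumptions made for this example.

```python
import random


def product_lower_bound(n, k, m):
    """Product of the factors (1 - 2*m*j/(n - j)), j = 1, ..., k-1."""
    p = 1.0
    for j in range(1, k):
        p *= max(0.0, 1.0 - 2.0 * m * j / (n - j))
    return p


def simple_lower_bound(n, k, m):
    """The bound 1 - m*k^2/(n - k)."""
    return 1.0 - m * k * k / (n - k)


def well_separated(indices, m):
    """True if all pairwise distances between the indices exceed m."""
    s = sorted(indices)
    return all(b - a > m for a, b in zip(s, s[1:]))


def estimate_prob_separated(n, k, m, trials=20000, seed=0):
    """Monte Carlo estimate of P(A_k) when k indices are drawn
    without replacement from {1, ..., n}."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if well_separated(rng.sample(range(1, n + 1), k), m):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n, k, m = 10000, 20, 5
    print("simple bound :", simple_lower_bound(n, k, m))
    print("product bound:", product_lower_bound(n, k, m))
    print("Monte Carlo  :", estimate_prob_separated(n, k, m))
```

With $k$ of smaller order than $\sqrt{n}$ the simple bound approaches 1, which is the regime used in the proof of Theorem 3.1.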

Proof of Theorem 3.1: Consider either a bootstrap or a permutation sample $X^*_\ell = X_{i_\ell}$, $\ell = 1, \ldots, k$, where $i_1, \ldots, i_k$ are randomly chosen indices from $\{1, \ldots, n\}$, independently from the original data $X_1, \ldots, X_n$. In the case of bootstrap these indices are chosen with replacement and in the case of permutations, without replacement.

Let the event $A_k$ be defined as in (7.40), which corresponds to the indices being spaced by more than $m$ lags away from each other. Thus, since the time series $X = \{X_i\}_{i \in \mathbb{Z}}$ is $m$-dependent,

$$(X^*_1, \ldots, X^*_k)\, 1_{A_k} \stackrel{d}{=} (\widetilde X_1, \ldots, \widetilde X_k)\, 1_{A_k},$$

where $\widetilde X_\ell$, $\ell = 1, \ldots, k$, are iid random variables with the same distribution as the $X_n$'s, which are independent from the event $A_k$. Observe that the event $A_k$ is also independent from the time series $X$ since it depends only on the random indices $i_1, \ldots, i_k$. Further, note that in the last relation, we have only equality in distribution and not equality almost surely.

Now, by Lemma 3.1 we have $\mathbb{P}(A_k) \to 1$, as $k \to \infty$, since $k(n) = o(\sqrt{n})$. Thus, Lemma 7.4 implies that any statistic based on the bootstrap or the randomly permuted sample will have the same limiting distribution as the corresponding statistic based on the iid sample $\{\widetilde X_\ell\}_{1 \le \ell \le k}$.

Let $\widetilde C^*(j) = \widetilde Y_j - j/\alpha$ be defined as the quantity $C^*(j)$ in (3.16), but where now $\widetilde Y_j$ is the max-spectrum based on the iid data $\widetilde X_1, \ldots, \widetilde X_k$. Theorem 4.1 in Stoev et al. (2006) implies that

$$\sqrt{k_j}\, \big(\widetilde C^*(j) - C^*\big) \stackrel{d}{\longrightarrow} \mathcal{N}(0, \sigma^2_{C^*}), \quad \text{as } k \to \infty, \tag{7.43}$$

where $\sigma^2_{C^*}$ is as in Theorem 3.1. As argued above, Lemma 7.4 and Relation (7.43) imply (3.18), which completes the proof of the theorem. □

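Lemma 7.4 Let $X_n$, $X$ and $Y_n$ be real random variables such that $X_n \stackrel{d}{\longrightarrow} X$, as $n \to \infty$. Let also $A_n$ and $B_n$ be some events such that $Y_n 1_{B_n} \stackrel{d}{=} X_n 1_{A_n}$. If $\mathbb{P}(A_n) = \mathbb{P}(B_n) \to 1$, $n \to \infty$, then $Y_n \stackrel{d}{\longrightarrow} X$, as $n \to \infty$.

Proof: Let $f : \mathbb{R} \to \mathbb{R}$ be an arbitrary bounded and continuous function. Since $\mathbb{E}|f(Y_n) 1_{B_n^c}| \le \mathrm{const}\, \mathbb{P}(B_n^c) = o(1)$, as $n \to \infty$, we have

$$\mathbb{E} f(Y_n) = \mathbb{E} f(Y_n) 1_{B_n} + o(1) = \mathbb{E} f(X_n) 1_{A_n} + o(1) = \mathbb{E} f(X_n) + o(1), \quad \text{as } n \to \infty.$$

This shows that $\lim_{n \to \infty} \mathbb{E} f(Y_n) = \lim_{n \to \infty} \mathbb{E} f(X_n)$, which completes the proof. □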
Proof of Theorem 3.2: Recall relation (3.16) and observe that by Proposition 3.1 we have

$$\widehat\alpha(j) = \alpha + O_P(b_n), \quad \text{and} \quad \widehat C(j) = C + O_P(b_n),$$

as $n \to \infty$, where $b_n := 1/2^{j(k(n)) \min\{1,\beta\}} + 2^{j(k(n))/2}/n^{1/2}$. Also, by Theorem 3.1 we have $a_n^{-1}(\widehat C^*(j) - C^*) \stackrel{d}{\longrightarrow} \mathcal{N}(0, \sigma^2_{C^*})$, as $n \to \infty$, where $a_n := 1/\sqrt{k_j} = 2^{j(k(n))/2}/k(n)^{1/2}$. Relation (3.19) implies that $b_n = o(a_n)$, $n \to \infty$. Indeed, since $k(n) = o(n)$, $n \to \infty$, we have $2^{j(k(n))/2}/n^{1/2} = o(2^{j(k(n))/2}/k(n)^{1/2}) = o(a_n)$, as $n \to \infty$. This shows that the second term of $b_n$ defined above is negligible with respect to $a_n$. By Relation (3.19), we also have $k/2^{j(k)(1 + 2\min\{1,\beta\})} \to 0$, as $k \to \infty$, or, equivalently, $1/2^{j(k)\min\{1,\beta\}} = o(2^{j(k)/2}/k^{1/2})$, as $k \to \infty$. Hence, the first term of $b_n$ defined above is also of order $o(2^{j(k(n))/2}/k(n)^{1/2}) = o(a_n)$, as $n \to \infty$.

Now, by using the fact that $b_n = o(a_n)$, $n \to \infty$, and the 'Delta-method' applied to the function $f(x, y, z) = 2^{x(y - z)}$ and $x_0 = \alpha$, $y_0 = C$ and $z_0 = C^*$ (see also (3.16)), we obtain

$$a_n^{-1}\big(\widehat\theta(j) - \theta\big) \stackrel{d}{\longrightarrow} \partial_z f(\alpha, C, C^*)\, Z \sim \mathcal{N}(0, \sigma_\theta^2),$$

as $n \to \infty$. Since $\partial_z f(x_0, y_0, z_0) = -\ln(2)\, \alpha\, \theta$, we obtain



where $Z$ is a 1-Fréchet variable (see Theorem 3.1). Since $\ln(2) \log_2(Z)$ has the standard Gumbel distribution, it follows that $\ln(2)^2\, \mathrm{Var}(\log_2(Z)) = \pi^2/6$ (see e.g. (22.31) in Johnson et al. (1994)). This completes the proof of the theorem. □
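Two ingredients of this proof lend themselves to a quick numerical check: the partial derivative $\partial_z f(x, y, z) = -\ln(2)\, x\, 2^{x(y-z)}$ used in the Delta-method step, and the Gumbel variance identity $\ln(2)^2\, \mathrm{Var}(\log_2 Z) = \pi^2/6$ for a 1-Fréchet variable $Z$. The sketch below is only an illustration: the evaluation point, sample size and function names are assumptions chosen for this example, and the variance is estimated by Monte Carlo rather than computed exactly.

```python
import math
import random


def f(x, y, z):
    """f(x, y, z) = 2^(x * (y - z)), as in the Delta-method step."""
    return 2.0 ** (x * (y - z))


def dz_f_closed_form(x, y, z):
    """Closed-form partial derivative in z: -ln(2) * x * f(x, y, z)."""
    return -math.log(2.0) * x * f(x, y, z)


def dz_f_numeric(x, y, z, h=1e-6):
    """Central finite-difference approximation of the same derivative."""
    return (f(x, y, z + h) - f(x, y, z - h)) / (2.0 * h)


def gumbel_variance_mc(n=200000, seed=0):
    """Monte Carlo estimate of Var(ln(2) * log2(Z)) for 1-Frechet Z.

    Z is simulated as -1/ln(U) with U uniform on (0, 1), so that
    ln(2) * log2(Z) = ln(Z) follows the standard Gumbel distribution.
    """
    rng = random.Random(seed)
    vals = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:      # avoid log(0)
            u = rng.random()
        z = -1.0 / math.log(u)
        vals.append(math.log(2.0) * math.log2(z))
    mean = sum(vals) / n
    return sum((v - mean) ** 2 for v in vals) / n


if __name__ == "__main__":
    x0, y0, z0 = 1.5, 0.3, 0.8   # hypothetical values of (alpha, C, C*)
    print(dz_f_numeric(x0, y0, z0), dz_f_closed_form(x0, y0, z0))
    print(gumbel_variance_mc(), math.pi ** 2 / 6)
```

At the point $(\alpha, C, C^*)$ the value of $f$ is $\theta$, so the closed form reduces to $-\ln(2)\, \alpha\, \theta$, as stated in the proof.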